How I Built a Live UK Data & AI Job Skills Tracker Using Python and AI

dhanasekar palani
Mar 24
3 min read

I have been learning AI recently and wanted to go beyond just following tutorials — so I started building real projects to see how far I could take it. This is one of them.

When I started thinking about what skills to focus on next in my data career, I kept running into the same problem — every "top skills" article was either 18 months old or just someone's opinion. I wanted actual data. Live job postings, extracted and analysed every month. It felt like the perfect project to learn AI properly: a real problem, a real pipeline, and something genuinely useful at the end of it.

So I built it myself. This post walks through how the pipeline works and the architecture behind it.

What it does

The finished product is a live interactive widget embedded on this blog. Every month it shows:

The most in-demand technical skills across UK Data & AI job postings
Breakdowns by role — Data Analyst, Data Scientist, ML Engineer, AI Engineer, Data Engineer
An AI-written narrative summarising the trends in plain English

Everything is pulled from real job postings, processed by AI, and updated with a single script run.

The architecture

The pipeline has five steps:

1. Fetch job postings — Adzuna API Adzuna is a UK job board with a free developer API. For each role I query the API and pull the latest 20 postings, including the full job description. The UK coverage is excellent — it was actually built here in London.

2. Extract skills — Groq (Llama 3.1) Each batch of job descriptions gets sent to the Groq API, which runs Llama 3.1-8b-instant. The prompt asks the model to return a JSON array of technical skills mentioned in the posting. Groq was the right choice here because it is genuinely free, fast, and has no regional restrictions — important since I am based in the UK.

3. Aggregate counts Once skills are extracted per job, I count how many postings mention each skill and sort by frequency. The top 15 skills per role get saved to the output.

4. Generate narrative — Groq again After aggregation, the pipeline calls Groq one more time per role to write a 2–3 paragraph blog-ready analysis of the skill data. This gets saved directly into the JSON so no API calls are needed when a reader loads the page.

5. Output — trends.json Everything lands in a single JSON file that the frontend widget fetches at page load. Hosted on GitHub, deployed via Netlify, embedded in Wix using an iframe.

Tech stack summary

Component	Tool	Cost
Job data	Adzuna API	Free
Skill extraction	Groq (Llama 3.1-8b-instant)	Free
Narrative generation	Groq (Llama 3.1-8b-instant)	Free
Widget hosting	Netlify	Free
Data hosting	GitHub	Free
Blog	Wix Core	Paid
Language	Python 3.11	Free

Total running cost: £0/month for the pipeline itself.

What I would do differently

A few things I would change if I were starting again:

Running the pipeline manually each month works fine for now, but I will eventually set up a GitHub Actions workflow to run it automatically on the first of each month and commit the updated JSON. That removes the one manual step entirely.

I would also increase the number of job postings per role from 20 to 50 from the start. The trends are directionally accurate at 20 but a larger sample would make the skill percentages more reliable.

Try it yourself

The full source code for the pipeline is on my GitHub. If you want to adapt it for a different country or job category, the main things to change are the ROLES dictionary and the where parameter in the Adzuna query.

The live widget is on this page — you can filter by role and see how the skill landscape shifts depending on whether you are looking at analyst roles versus engineering roles.

Built with Python, Groq, Adzuna, Netlify, and a lot of error messages.