
How I Built a Live UK Data & AI Job Skills Tracker Using Python and AI

  • Writer: dhanasekar palani
  • Mar 24
  • 3 min read

I have been learning AI recently and wanted to go beyond just following tutorials — so I started building real projects to see how far I could take it. This is one of them.

When I started thinking about what skills to focus on next in my data career, I kept running into the same problem — every "top skills" article was either 18 months old or just someone's opinion. I wanted actual data. Live job postings, extracted and analysed every month. It felt like the perfect project to learn AI properly: a real problem, a real pipeline, and something genuinely useful at the end of it.

So I built it myself. This post walks through how the pipeline works and the architecture behind it.


What it does

The finished product is a live interactive widget embedded on this blog. Every month it shows:

  • The most in-demand technical skills across UK Data & AI job postings

  • Breakdowns by role — Data Analyst, Data Scientist, ML Engineer, AI Engineer, Data Engineer

  • An AI-written narrative summarising the trends in plain English

Everything is pulled from real job postings, processed by AI, and updated with a single script run.


The architecture

The pipeline has five steps:

1. Fetch job postings: Adzuna API
Adzuna is a UK job board with a free developer API. For each role I query the API and pull the latest 20 postings, including the full job description. The UK coverage is excellent; it was actually built here in London.


2. Extract skills: Groq (Llama 3.1)
Each batch of job descriptions is sent to the Groq API, which runs Llama 3.1 8B Instant (model ID llama-3.1-8b-instant). The prompt asks the model to return a JSON array of the technical skills mentioned in each posting. Groq was the right choice here because its free tier covers this workload, it is fast, and it has no regional restrictions, which matters since I am based in the UK.


3. Aggregate counts
Once skills are extracted per job, I count how many postings mention each skill and sort by frequency. The top 15 skills per role are saved to the output.


4. Generate narrative: Groq again
After aggregation, the pipeline calls Groq once more per role to write a 2–3 paragraph blog-ready analysis of the skill data. The narrative is saved directly into the JSON, so no API calls are needed when a reader loads the page.


5. Output: trends.json
Everything lands in a single JSON file that the frontend widget fetches at page load. The file is hosted on GitHub, and the widget is deployed via Netlify and embedded in Wix using an iframe.
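
Steps 2 to 5 above can be sketched offline as follows. This is a minimal illustration, not the code from the repository: the function names (parse_skills, count_skills, save_trends), the layout of the output dictionary, the lower-casing of skill names, and the sample model replies are all assumptions made for the example. The Groq calls themselves are left out, so hard-coded strings stand in for what the model might return.

```python
import json
import re
from collections import Counter
from pathlib import Path

TOP_N = 15  # the post keeps the top 15 skills per role


def parse_skills(reply: str) -> list[str]:
    """Extract a JSON array of skill names from a model reply (step 2)."""
    # Replies sometimes wrap the array in prose, so grab the outermost [...] span.
    match = re.search(r"\[.*\]", reply, flags=re.DOTALL)
    if match is None:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    return [s.strip() for s in items if isinstance(s, str) and s.strip()]


def count_skills(postings: list[list[str]]) -> list[dict]:
    """Count how many postings mention each skill and keep the top N (step 3)."""
    if not postings:
        return []
    counts = Counter()
    for skills in postings:
        # A set counts each skill once per posting; lower-casing merges
        # "Python" and "python" (a normalisation choice assumed here).
        counts.update({s.lower() for s in skills})
    # Sort by count (descending), then name, so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:TOP_N]
    total = len(postings)
    return [
        {"skill": skill, "count": n, "pct": round(100 * n / total, 1)}
        for skill, n in ranked
    ]


def save_trends(trends: dict, path: Path) -> None:
    """Write the combined output to disk (step 5)."""
    path.write_text(json.dumps(trends, indent=2), encoding="utf-8")


# Hard-coded stand-ins for three model replies for one role.
replies = [
    '["Python", "SQL", "Power BI"]',
    'Here are the skills: ["SQL", "python", "Excel"] Hope that helps.',
    "Sorry, I could not find any skills.",
]
postings = [parse_skills(r) for r in replies]
skills = count_skills(postings)
trends = {
    "Data Analyst": {
        "postings": len(postings),
        "skills": skills,
        # In the real pipeline this text would come from the second Groq call (step 4).
        "narrative": "(model-written summary goes here)",
    }
}
print(json.dumps(trends, indent=2))
```

Counting each skill once per posting matches the "how many postings mention each skill" definition above, and a reply with no parseable array still counts towards the total, so percentages are always out of all postings fetched. Calling save_trends(trends, Path("trends.json")) would write the file the widget reads.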

Tech stack summary

Component            | Tool                        | Cost
Job data             | Adzuna API                  | Free
Skill extraction     | Groq (Llama 3.1-8b-instant) | Free
Narrative generation | Groq (Llama 3.1-8b-instant) | Free
Widget hosting       | Netlify                     | Free
Data hosting         | GitHub                      | Free
Blog                 | Wix Core                    | Paid
Language             | Python 3.11                 | Free

Total running cost: £0/month for the pipeline itself.


What I would do differently

A few things I would change if I were starting again:

Running the pipeline manually each month works fine for now, but I will eventually set up a GitHub Actions workflow to run it automatically on the first of each month and commit the updated JSON. That removes the one manual step entirely.

I would also increase the number of job postings per role from 20 to 50 from the start. The trends are directionally accurate at 20 but a larger sample would make the skill percentages more reliable.
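
To put rough numbers on that, here is a back-of-the-envelope calculation using the normal approximation for a proportion. It is only a sketch: the approximation is crude at samples this small, the helper name margin_of_error is made up for the example, and the 50% share is an assumption chosen because it gives the widest interval.

```python
import math


def margin_of_error(share: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a share observed in n postings."""
    return z * math.sqrt(share * (1 - share) / n)


for n in (20, 50):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: a skill seen in 50% of postings, give or take {moe * 100:.0f} points")
# n=20 gives roughly 22 points either way; n=50 gives roughly 14.
```

So at 20 postings a skill that shows up in half of them could plausibly sit anywhere from about 28% to 72% in the wider market, which is why the rankings are best read as directional.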


Try it yourself

The full source code for the pipeline is on my GitHub. If you want to adapt it for a different country or job category, the main things to change are the ROLES dictionary and the where parameter in the Adzuna query.
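
As a rough illustration of those two knobs, the sketch below builds an Adzuna search URL from a roles dictionary. The contents of ROLES, the helper name build_search_url, and the placeholder credentials are assumptions rather than the repository's actual code. The endpoint shape and query parameters follow Adzuna's public API documentation, so check them against the current docs before relying on them. In this URL scheme the country is a two-letter code in the path (gb here), while where narrows the search to a location within that country.

```python
from urllib.parse import urlencode

# Hypothetical mapping of display name -> search phrase sent to Adzuna.
ROLES = {
    "Data Analyst": "data analyst",
    "Data Scientist": "data scientist",
    "ML Engineer": "machine learning engineer",
    "AI Engineer": "ai engineer",
    "Data Engineer": "data engineer",
}

BASE_URL = "https://api.adzuna.com/v1/api/jobs"


def build_search_url(what, app_id, app_key, country="gb", where=None,
                     results_per_page=20, page=1):
    """Build a search URL for one role; no request is sent here."""
    params = {
        "app_id": app_id,
        "app_key": app_key,
        "what": what,
        "results_per_page": results_per_page,
        "sort_by": "date",  # newest postings first
    }
    if where:
        params["where"] = where
    return f"{BASE_URL}/{country}/search/{page}?{urlencode(params)}"


for role, phrase in ROLES.items():
    # Placeholder credentials; real ones come from an Adzuna developer account.
    print(role, build_search_url(phrase, "YOUR_APP_ID", "YOUR_APP_KEY", where="London"))
```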


The live widget is on this page — you can filter by role and see how the skill landscape shifts depending on whether you are looking at analyst roles versus engineering roles.

Built with Python, Groq, Adzuna, Netlify, and a lot of error messages.

