Project - Skills Classification & Matching Platform

Building a classification and matching platform that maps resumes and job descriptions to standardized occupational models using vector embeddings, graph-based matching, and Document AI.

Client: Curriculo
Year: 2024
Service: Backend Development, Machine Learning

Overview

Curriculo is a platform that matches candidates to positions based on their actual skills and experience rather than keyword overlap. To do this accurately, the platform needs to understand resumes and job descriptions in terms of standardized occupational models — the EU's ESCO taxonomy and the US O*NET model, among others.

I built the classification and matching infrastructure that powers this: from acquiring and embedding the taxonomy data, to the APIs that classify and serve it, to the serverless layer that connects everything to the web application, to the graph database where matched relationships are stored and queried.

Key Contributions

Classification Engine

Built a Python Flask API that classifies free-text queries, resumes, and job descriptions against standardized occupational models. The API searches vector embeddings of the full ESCO and O*NET taxonomies, indexed with Faiss for fast similarity search. This allows the platform to take unstructured text and return ranked classifications, mapping a job description or resume to the most relevant occupations, skills, and competencies.
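The ranking step can be illustrated with a minimal sketch. It is not the project's code: a linear scan in plain Python stands in for the Faiss index, and the three-dimensional vectors, the labels, and the names `normalize`, `top_k`, `TAXONOMY`, and `INDEX` are hypothetical. Real embeddings would come from an embedding model and have hundreds of dimensions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, index, k=3):
    """Return the k labels most similar to the query, best first.

    A brute-force scan over every entry; a vector index such as Faiss
    would replace this loop in a real system.
    """
    q = normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), label)
        for label, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(label, round(score, 3)) for score, label in scored[:k]]


# Hypothetical toy embeddings for three occupations (assumption, not real data).
TAXONOMY = {
    "software developer": [0.9, 0.1, 0.0],
    "data analyst": [0.6, 0.6, 0.1],
    "nurse": [0.0, 0.1, 0.9],
}

# Pre-normalize once, the way an index is built ahead of query time.
INDEX = {label: normalize(vec) for label, vec in TAXONOMY.items()}

# Hypothetical embedding of an incoming job description.
query_embedding = [0.8, 0.2, 0.0]
results = top_k(query_embedding, INDEX, k=2)
print(results)
```

With Faiss, the scan would be replaced by adding the normalized vectors to an index and calling its search method; which index type the project uses is not stated here.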

Taxonomy Metadata API

Built a second Python API that serves as the single source of truth for all taxonomy data across the occupational models. It exposes occupations, skills, relationships, and hierarchies through a GraphQL interface, so each consumer can request exactly the fields it needs and nothing more.

Serverless Integration Layer

Built a set of TypeScript cloud functions that act as the glue between the web application and the backend APIs. These functions handle orchestration, data transformation, and routing — keeping the web application thin and the backend services decoupled.

Document AI Pipeline

One of the cloud functions implements a Document AI pipeline for processing uploaded resumes and job descriptions. It extracts structured data from documents, which is then fed into the classification engine for taxonomy mapping.

Graph-Based Matching

Classified relationships between candidates, positions, skills, and occupations are stored in Neo4j. This graph structure supports matching algorithms that traverse explicit taxonomy relationships instead of relying only on keyword or vector similarity, so each candidate-to-position match can be traced back to the skills and occupations that connect the two.
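To show why relationship traversal can match what keyword overlap misses, here is a small sketch under stated assumptions. An in-memory graph stands in for Neo4j, and the relation names (`HAS_SKILL`, `REQUIRES`, `BROADER`), the node names, and the scoring rule are all hypothetical rather than the platform's actual schema or algorithm.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory stand-in for a graph database (assumption, not Neo4j)."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def neighbors(self, source, relation):
        return self.edges.get((source, relation), set())


def expand(graph, skills):
    """Return the given skills plus every broader skill reachable from them."""
    seen = set(skills)
    stack = list(skills)
    while stack:
        skill = stack.pop()
        for parent in graph.neighbors(skill, "BROADER"):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def match_score(graph, candidate, position):
    """Fraction of the position's required skills the candidate covers."""
    required = graph.neighbors(position, "REQUIRES")
    if not required:
        return 0.0
    held = expand(graph, graph.neighbors(candidate, "HAS_SKILL"))
    return len(required & held) / len(required)


g = Graph()
# Hypothetical data: the candidate lists "python", the position asks for "programming".
g.add("candidate:ana", "HAS_SKILL", "python")
g.add("candidate:ana", "HAS_SKILL", "sql")
g.add("python", "BROADER", "programming")
g.add("position:backend", "REQUIRES", "programming")
g.add("position:backend", "REQUIRES", "sql")
g.add("position:backend", "REQUIRES", "docker")

score = match_score(g, "candidate:ana", "position:backend")
print(round(score, 2))
```

A plain keyword comparison would miss the "python" to "programming" link; following the hierarchy edge recovers it, which is the kind of taxonomy-aware matching a graph store makes straightforward to express.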

Python
Flask
TypeScript
Faiss
Neo4j
GraphQL
Document AI
Cloud Functions

Taxonomy Models: ESCO + O*NET
Vector Search: Faiss
Graph Matching: Neo4j
Architecture: Serverless