Project - Skills Classification & Matching Platform
Building a classification and matching platform that maps resumes and job descriptions to standardized occupational models using vector embeddings, graph-based matching, and Document AI.
- Client: Curriculo
- Year
- Service: Backend Development, Machine Learning
Overview
Curriculo is a platform that matches candidates to positions based on their actual skills and experience rather than keyword overlap. To do this accurately, the platform needs to understand resumes and job descriptions in terms of standardized occupational models — the EU's ESCO taxonomy and the US O*NET model, among others.
I built the classification and matching infrastructure that powers this: from acquiring and embedding the taxonomy data, to the APIs that classify and serve it, to the serverless layer that connects everything to the web application, to the graph database where matched relationships are stored and queried.
Key Contributions
Classification Engine
Built a Python Flask API that classifies free-text queries, resumes, and job descriptions against standardized occupational models. The API uses vector embeddings acquired for the full ESCO and O*NET taxonomies, indexed with Faiss for fast similarity search. Given unstructured text, it returns ranked classifications that map a job description or resume to the most relevant occupations, skills, and competencies.
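To illustrate the idea behind embedding-based classification, here is a minimal sketch of ranking taxonomy entries by cosine similarity to a query vector. It is not the platform's code: the function names, the `OCCUPATIONS` table, and the three-dimensional toy vectors are all hypothetical, and a brute-force pure-Python search stands in for the Faiss index, which performs the same kind of nearest-neighbour lookup at scale.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_vec, labelled_vecs, k=2):
    """Return the k labels whose vectors are most similar to the query."""
    q = normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), label)
        for label, vec in labelled_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(label, score) for score, label in scored[:k]]


# Toy stand-ins for taxonomy embeddings (assumption: real vectors come
# from an embedding model and have hundreds of dimensions).
OCCUPATIONS = {
    "software developer": [0.9, 0.1, 0.0],
    "data scientist": [0.7, 0.6, 0.1],
    "nurse": [0.0, 0.1, 0.95],
}

# Hypothetical embedding of a job description.
ranked = top_k([0.8, 0.3, 0.0], OCCUPATIONS, k=2)
for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

With these toy vectors, "software developer" ranks ahead of "data scientist", and "nurse" falls outside the top two.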
Taxonomy Metadata API
Built a second Python API that serves as the single source of truth for all taxonomy data across the occupational models. It exposes exactly the information the platform needs — occupations, skills, relationships, hierarchies — through a GraphQL interface, keeping the data layer clean and the queries efficient.
Serverless Integration Layer
Built a set of TypeScript cloud functions that act as the glue between the web application and the backend APIs. These functions handle orchestration, data transformation, and routing — keeping the web application thin and the backend services decoupled.
Document AI Pipeline
One of the cloud functions implements a Document AI pipeline for processing uploaded resumes and job descriptions. It extracts structured data from documents, which is then fed into the classification engine for taxonomy mapping.
Graph-Based Matching
Classified relationships between candidates, positions, skills, and occupations are stored in Neo4j. The graph structure supports detailed, controllable matching algorithms that go beyond simple keyword or vector similarity: candidate-to-position matches are computed from the actual taxonomy relationships that connect the two.
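As a rough illustration of relationship-based matching, the sketch below scores candidates against a position by the skills they share. Everything in it is an assumption made for the example: the data, the names, and the weights are hypothetical, the essential/optional split is borrowed from the way ESCO relates skills to occupations, and plain Python dictionaries stand in for the Neo4j graph and its queries.

```python
# Hypothetical edges: position -> {skill: relation type}.
POSITION_SKILLS = {
    "backend-engineer": {
        "python": "essential",
        "sql": "essential",
        "docker": "optional",
    },
}

# Hypothetical edges: candidate -> set of classified skills.
CANDIDATE_SKILLS = {
    "alice": {"python", "sql"},
    "bob": {"python", "docker"},
}


def match_score(candidate_skills, required, essential_weight=2.0, optional_weight=1.0):
    """Weighted share of a position's skills that the candidate covers (0.0 to 1.0)."""
    total = 0.0
    matched = 0.0
    for skill, relation in required.items():
        weight = essential_weight if relation == "essential" else optional_weight
        total += weight
        if skill in candidate_skills:
            matched += weight
    return matched / total if total else 0.0


def rank_candidates(position):
    """Rank all candidates for a position, best match first."""
    required = POSITION_SKILLS[position]
    scores = [
        (name, match_score(skills, required))
        for name, skills in CANDIDATE_SKILLS.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates("backend-engineer")
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Here the candidate covering both essential skills outranks the one covering one essential and one optional skill, which a flat keyword count (two matches each) would not distinguish.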
- Python
- Flask
- TypeScript
- Faiss
- Neo4j
- GraphQL
- Document AI
- Cloud Functions
- Taxonomy Models: ESCO + O*NET
- Vector Search: Faiss
- Graph Matching: Neo4j
- Architecture: Serverless