I Turned 23 Years of Project Notes Into an AI Fit Engine for Recruiters

Fit for purpose. Nothing personal.

Part 1: The Business Case

Resumes Don't Answer the Right Question

Your resume is polished and says everything you want it to say. It also tells the recruiter almost nothing they actually need to know.

The question a recruiter actually has is straightforward: does this person fit my role? Can you solve my pain points? Not "do they have a computer science degree" and not "did they work at a recognizable company." The real question is whether this specific person's experience maps to this specific role's requirements, right now, in a way that is worth a phone call.

A resume cannot answer that question. Neither can a LinkedIn profile. Even a custom resume generated with AI for a specific JD is a step forward, but it is still a static document. What if instead you could score a job description against 23 years of detailed project history and then let the recruiter ask follow-up questions in real time? No phone call. No scheduling. Just answers, grounded in evidence, available on demand.

The Tool

So I built chat.kostenko.com. It's passcode-protected, so if you're a decision-maker or a recruiter, ping me. Otherwise, you can watch a recording to see it in action.

It is an AI-powered fit analysis tool. A recruiter pastes any job description and within seconds receives a scored breakdown of how well my background maps to the role. Not a keyword match, not a summary. A reasoned analysis grounded in 23 years of documented project work (~40-45K tokens), with scores across 7 dimensions and specific proof points pulled from real case studies.

It is always on, it does not need me to be available, and it can answer follow-up questions the recruiter did not know they had. It is a way to probe a candidate with zero commitment. The tool is also privacy-preserving: it does not track input or output (though the underlying model providers may, of course).

How It Works

Step 1: Get the access code.
The site is passcode-protected. If a recruiter has read my LinkedIn profile, they have the code.

Step 2: Paste the JD.
The recruiter pastes the job description directly into the input, or uploads a PDF from their ATS. The tool accepts both.

Step 3: Get the fit score.
The AI reads the JD, compares it against the knowledge base, and returns scores across seven dimensions: technical delivery, AI and emerging technology, program and delivery leadership, solution architecture, client and stakeholder management, product and commercial acumen, and domain alignment. Each score comes with a short explanation and a cited proof point from a real engagement.
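The scored output described above can be sketched as a simple data shape. This is a hypothetical schema, not the app's actual one, and the real tool may weight dimensions rather than average them:

```typescript
// Hypothetical shape of the fit-analysis output; the real schema may differ.
type Dimension =
  | "technical-delivery"
  | "ai-emerging-tech"
  | "program-delivery-leadership"
  | "solution-architecture"
  | "client-stakeholder-management"
  | "product-commercial-acumen"
  | "domain-alignment";

interface DimensionScore {
  dimension: Dimension;
  score: number;      // 0-100
  rationale: string;  // short explanation
  proofPoint: string; // cited engagement from the knowledge base
}

// Overall fit as a plain average of the per-dimension scores.
function overallFit(scores: DimensionScore[]): number {
  const total = scores.reduce((sum, s) => sum + s.score, 0);
  return Math.round(total / scores.length);
}
```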

Step 4: Ask follow-up questions.
After the analysis, the recruiter can ask anything. Has he managed distributed teams? What does his experience with regulated industries look like? Has he run a vendor selection process before? The AI answers from the knowledge base. If the evidence is not there, it says so clearly. The AI system prompt was designed to answer directly and not to speculate. The tool also suggests follow-up questions to ask to make it easier for the recruiter to explore further.

Step 5: Take action.
If the fit score lands above 70%, the tool presents a direct link to book a call. Weaker fits get a LinkedIn connection prompt instead. The tool qualifies the lead and routes it. No inbox required on my end.
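The routing rule above amounts to a one-line decision. A minimal sketch, with the 70% threshold taken from the behavior described (names are mine, not the app's):

```typescript
// Strong fits get a booking link; weaker fits get a LinkedIn prompt.
type NextStep = "book-call" | "linkedin-connect";

function routeLead(fitScore: number): NextStep {
  // "Above 70%" is exclusive: a score of exactly 70 routes to LinkedIn.
  return fitScore > 70 ? "book-call" : "linkedin-connect";
}
```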

The Knowledge Base: The Part That Actually Makes It Work

The hard part was not building the app. You have Claude for that. The knowledge base is what makes this work, and it took considerable time to create.

The tool runs on five structured markdown files I wrote and maintain. Together they contain over 21,700 words and roughly 163KB of content covering 23 years of work in granular, evidence-rich detail. It's about 40-45K tokens. This is not a resume. It is closer to a technical dossier that the AI reasons against every time a recruiter asks a question.

Professional Profile (01) is the anchor document. It covers career arc, actual titles, team sizes, client names, outcomes, and sets the parameters for what the AI is allowed to say, including what kinds of roles I am targeting and which I will not consider.

Projects Portfolio (02) is the largest file at over 9,500 words. It documents 20+ client engagements in structured case study format: problem, role, what was built, technical architecture, outcomes. P&G Braun, Tiffany, Estee Lauder MAC, the Federal Reserve, the English Premier League, the NBA. Each entry exists so the AI can pull specific, credible proof points when a JD calls for relevant experience.

Skills and Expertise (03) maps capabilities to the specific contexts in which they were applied. Not a keyword list. The AI Engineering section alone documents hands-on work with local LLM inference (Ollama, Open WebUI), the Vercel AI SDK, the OpenAI and Anthropic APIs, MCP server development, prompt engineering, evals, OpenClaw experimentation, and multi-agent orchestration. Each entry links back to something meaningful that shipped.

Leadership Philosophy (04), at 3,500+ words, covers how I build teams, run delivery under pressure, think about compensation design, and approach organizational problems. This file matters when a VP-level JD asks about leadership style and the AI needs to go deeper than a credential list. My StrengthsFinder profiles are part of this repository as well.

Target Job (05) tells the AI how to calibrate the fit score from my perspective, not just the job description's. A role that matches every technical requirement but lands in pure DeFi trading still gets flagged. The AI reasons about alignment in both directions.

The depth of these files is what separates this from a chatbot on top of a LinkedIn profile. The AI is only as good as the evidence it can draw from.


Part 2: How It Was Built

Stack and Why

Next.js 15 App Router, deployed on Vercel. The AI layer uses the Vercel AI SDK v4 with OpenAI as the initial inference provider. I'm publishing this right at release, so the model and provider choices are subject to change as I learn. The model is configured through an environment variable: switching between GPT-4.1, GPT-4.1-mini, or any future model requires no code change. The analysis endpoint can run a more capable model while chat runs a lighter one; both share the same codebase.
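The env-driven model selection can be sketched as a small helper. The variable names here are assumptions, not the app's actual configuration:

```typescript
// Resolve model IDs from environment variables, with sensible defaults.
// ANALYSIS_MODEL and CHAT_MODEL are hypothetical names for illustration.
function resolveModels(env: Record<string, string | undefined>): {
  analysisModel: string;
  chatModel: string;
} {
  return {
    // A more capable model for the one-shot fit analysis...
    analysisModel: env.ANALYSIS_MODEL ?? "gpt-4.1",
    // ...and a lighter, cheaper one for follow-up chat.
    chatModel: env.CHAT_MODEL ?? "gpt-4.1-mini",
  };
}
```

Swapping models then means changing an environment variable in the Vercel dashboard and redeploying; no code edit is required.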

Knowledge Base Architecture: Deliberately No Vector Database

The knowledge base does not use embeddings or vector search. All five files are loaded into the system prompt as raw text at inference time. The analysis endpoint uses the full context at approximately 41,000 tokens. The chat endpoint uses a condensed version at roughly 20,000 tokens.

This is a deliberate choice for now. Vector search adds infrastructure complexity, retrieval latency, and non-trivial engineering overhead. The source files are still being updated regularly, and running each iteration through a chunking and embedding pipeline (e.g., VoyageAI) adds friction that slows down the feedback loop. Vector search makes sense when the knowledge base is too large to fit in a single context window. At 160KB, this one fits.

The tradeoff is higher token cost per request and a hard ceiling on growth. At the current size, each analysis request consumes roughly 45,000 tokens of input. If the KB doubles, it would start consuming over half the model's context window, at which point a retrieval-based architecture would become necessary. For now, simplicity wins and the system is faster to iterate on.

The files are bundled directly into the Vercel serverless function at build time using outputFileTracingIncludes. No database, no object storage, no external fetch at runtime.
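The bundling step might look like the following next.config.ts fragment. The route and file paths are assumptions for illustration; `outputFileTracingIncludes` is the top-level Next.js 15 option that tells Vercel to ship extra files alongside a serverless function:

```typescript
// next.config.ts (fragment) -- sketch only; the actual paths are assumptions.
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  outputFileTracingIncludes: {
    // Ship the knowledge-base markdown with each API route that reads it
    // at runtime, so no database or object storage is needed.
    "/api/analyze": ["./knowledge/*.md"],
    "/api/chat": ["./knowledge/*.md"],
  },
};

export default nextConfig;
```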

PDF Upload

The upload endpoint accepts a PDF, extracts text using pdf-parse, and routes it through the same analysis pipeline as a pasted JD. Getting pdf-parse working in the Vercel serverless environment required explicit serverExternalPackages configuration in Next.js because the library depends on native Node bindings that do not bundle cleanly by default.
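The configuration fix amounts to opting pdf-parse out of bundling. A minimal next.config.ts fragment, assuming the standard package name:

```typescript
// next.config.ts (fragment) -- serverExternalPackages keeps pdf-parse
// out of the bundle so its native Node dependencies resolve from
// node_modules at runtime instead of failing in the serverless build.
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  serverExternalPackages: ["pdf-parse"],
};

export default nextConfig;
```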

NSFW Filter

Every input passes through OpenAI's moderation API before reaching inference. A flagged input returns a 451 status code, the client renders a block screen, and a flag is written to localStorage. The block is "permanent" by design. Two deliberate tradeoffs: fail-open if the moderation API is unavailable (availability over perfect enforcement), and one-way door on confirmed violations.

Passcode Gate

Validated server-side on every API route via a custom request header, because client state can be spoofed. A server-side header check cannot be bypassed without knowing the code. The gate is not a security system. It filters random internet traffic and signals to recruiters that they need to engage with the LinkedIn profile to access the tool. This is not sophisticated, but it is a reasonable bar for this experiment.
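A server-side header check like the one described might look like this. The header name is an assumption; the constant-time comparison (via Node's `timingSafeEqual`) is a cheap hardening step, though for a gate that is explicitly not a security system it is a nicety rather than a requirement:

```typescript
import { timingSafeEqual } from "node:crypto";

// Check a custom request header against the expected passcode.
// "x-access-code" is a hypothetical header name for illustration.
function isAuthorized(headers: Headers, expected: string): boolean {
  const provided = headers.get("x-access-code") ?? "";
  // timingSafeEqual requires equal-length buffers, so reject early on
  // a length mismatch.
  if (provided.length !== expected.length) return false;
  return timingSafeEqual(Buffer.from(provided), Buffer.from(expected));
}
```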

Streaming Responses

Both endpoints stream using toDataStreamResponse(). The client renders tokens in real time via the useChat hook. Users consistently perceive streaming interfaces as faster even when total time to completion is identical. For a tool where first meaningful output takes 2 to 3 seconds to generate, streaming makes the experience feel immediate rather than frozen.
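Under the hood, a streaming response is a `Response` whose body is a `ReadableStream` of encoded tokens. A greatly simplified sketch using the Web Streams API, with a plain array standing in for the model's incremental output:

```typescript
// Wrap a sequence of tokens in a streaming Response. In the real app,
// toDataStreamResponse() does this (plus protocol framing) over the
// model's live token stream.
function makeStreamingResponse(tokens: string[]): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      // Enqueue each token as soon as it is "generated" so the client
      // can render partial output instead of waiting for completion.
      for (const token of tokens) controller.enqueue(encoder.encode(token));
      controller.close();
    },
  });
  return new Response(stream, {
    headers: { "content-type": "text/plain; charset=utf-8" },
  });
}
```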

Session Limit as Product Design

The chat interface caps sessions at 20 questions with a warning at 15. This is a product decision, not a technical one. A cap creates light urgency and gives the session a natural end point. The warning at 15 functions as a soft call to action. The intent is for the session limit to push a qualified recruiter toward booking a real call before they run out of questions. That is the outcome the whole tool is built toward.
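The cap-and-warn behavior is a two-threshold state check. A sketch with the thresholds taken from the description above:

```typescript
// Warn at 15 questions, block at 20; defaults mirror the product behavior.
type SessionState = "ok" | "warn" | "blocked";

function sessionState(questionsAsked: number, warnAt = 15, cap = 20): SessionState {
  if (questionsAsked >= cap) return "blocked";
  if (questionsAsked >= warnAt) return "warn";
  return "ok";
}
```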


Demo

I recorded a 4-minute demo of the app here. Feedback is a gift and is always welcome. Message me on LinkedIn.