AI hiring that explains itself.

Why reasoning beats scoring — for the recruiter, the candidate, and the regulator.

By Jon Senger · Founder, CertAIn · April 2026 · 7-minute read

The scored candidate

Here's an industry artifact most people in recruiting have seen:

Candidate: Maria Ruiz
Match score: 74/100

That's it. That's the output of almost every AI-powered candidate-screening feature on the market today. A number. Maybe a colored bar. Maybe five sub-scores — "Skills: 82, Experience: 68, Education: 71, Location: 90, Recency: 59" — but no explanation of what any of those numbers mean, how they were calculated, or what a recruiter is supposed to do about them.

Ask the vendor how the score was produced and the answer is: proprietary model. Ask them why Maria got a 74 and not an 82, and there's no answer that doesn't require a data scientist in the room. Ask what would happen if Maria sued on the basis of an AEDT violation, and the vendor sends you to legal.

A score is not judgment. A score is a shape that looks like judgment.

What a senior recruiter actually does

A senior recruiter, reading Maria's resume for thirty seconds, produces something structurally different from a score. They produce a small piece of prose in their head that goes something like:

Maria has done the work — she ran account expansion at a direct competitor of the company hiring, which is as close to on-the-nose as it gets. Her last two titles were director-level, so she'll want a principal or senior IC title here unless we can position this role as strategic. Main gap: she hasn't managed a multi-country team, and that's in the JD. I'd want to know whether that's a career goal for her or a no-go.

That paragraph does three things a score cannot:

1. Claims grounded in the resume. Every assertion traces back to something Maria wrote — her prior company, her titles, her work history.
2. A gap named honestly. The recruiter says what's missing, not just what's present.
3. A question for the next conversation. The paragraph doesn't just evaluate — it proposes what to do next.

That is the unit of real judgment in recruiting. A score has zero of those three things. A paragraph has all three.

The case for reasoning

When we say "AI hiring that explains itself," we mean building the AI to produce exactly that paragraph — not a score, not a percentage, not a ranked list without explanations. A paragraph the recruiter can read, agree or disagree with, and act on.

Three audiences benefit from reasoning in ways they do not benefit from scoring:

The recruiter
A recruiter given a score has to decide whether to trust it. That requires either blind faith (dangerous) or re-reading the resume to verify (which defeats the point of having the tool). A recruiter given a paragraph can verify the reasoning in five seconds — does the paragraph match what they see in the resume? — and override confidently when it doesn't. The tool becomes a second opinion the recruiter can argue with, not a black box the recruiter has to obey or ignore.

The candidate
Eventually, the reasoning becomes something the candidate can see. "You weren't advanced because you haven't managed a multi-country team, and that's a JD requirement" is a rejection a candidate can learn from — and challenge if they actually have managed one and the resume didn't surface it. "You got a 62" is neither. Candidate-facing reasoning is the next generation of this feature (it's on our roadmap, not live yet); the substrate has to exist first, and it does.

The regulator
NYC AEDT, Illinois AIVIA, and the EU AI Act don't ask for match scores. They ask for evidence of human oversight, documentation of what the AI did, and — in AEDT's case — the data for a bias audit. A paragraph of reasoning is already the documentation they want; a score rounds that documentation down to a number and has to be expanded back out before any regulator will accept it.

The CertAIn output, as it actually ships

Here's the Ranking output CertAIn produces for a candidate on a real JD. This is what lands in a recruiter's queue after a Ranking action runs:

#3 of 47 — Strong fit, one meaningful gap.

Candidate has seven years of B2B SaaS account-executive experience selling into mid-market, which matches the core of this role. She's closed deals in the $50K–$250K ACV band — below the $500K+ range you've set as target — so expect a ramp on deal size if she moves into enterprise motion here. Two years of vertical-SaaS experience in adjacent markets (fintech, legal), zero direct healthcare experience, but she ran an RFP cycle against a healthcare-specific competitor in 2024, which suggests familiarity with the buyer.

Probe in the interview: her experience managing a multi-threaded evaluation with procurement (a JD requirement not visible in the resume), and how she plans to ramp into healthcare-specific buyer motion in the first two quarters.

If advanced: ask for two reference calls — one from a customer she closed at $250K+ and one from a channel partner.

Notice what's in there:

  • The ranking (#3 of 47), with a one-line summary framing.
  • Specific, grounded claims (seven years, $50K–$250K band, 2024 RFP).
  • A gap named explicitly and separately from the strengths.
  • Two concrete things to probe in the next conversation.
  • A recommendation for the step after, if the candidate advances.

Now imagine a stack of 47 of those paragraphs, ordered by fit. That's what the Ranking output looks like after a pass over a real JD — one credit-denominated Ranking action per candidate. A recruiter reading the top ten can make a confident pass-or-reject call in under a minute per candidate — not because they skipped the work, but because the work was done in a form they can read and verify.

What this costs, in the mundane sense

A common objection when we show people this: "isn't that slower / more expensive than a score?" The honest answer is: slightly, yes, per candidate. The model has to do more work to generate a paragraph than to generate a number. At today's model costs, that's fractions of a cent per candidate — which, under our credit-based pricing, we meter as 1 credit per Ranking action.

For a role with 400 applicants, that's 400 credits — which on a Growth plan ($249 for 1,000 credits/month) is 40 percent of one month's allocation. Two full searches fit comfortably in that plan. On a per-hire basis, that's dollars, not a meaningful line item on cost-per-hire.
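
If you want that arithmetic spelled out, here's a back-of-envelope sketch using only the list-price figures above. The pro-rata dollar framing is a simplification added for illustration (credits are bought in plans, not billed per candidate), but it shows the order of magnitude:

```python
# Back-of-envelope: ranking a 400-applicant role on the Growth plan.
# Plan price, credit count, and 1-credit-per-Ranking-action are the
# published figures quoted above; the pro-rata dollar cost is a
# simplification for illustration only.
plan_price = 249           # Growth plan, USD per month
plan_credits = 1_000       # credits included per month
credits_per_candidate = 1  # one Ranking action per candidate

applicants = 400
credits_used = applicants * credits_per_candidate  # 400 credits
share_of_plan = credits_used / plan_credits        # 0.40 of the month
dollar_cost = plan_price * share_of_plan           # ~$99.60 per role
per_candidate = dollar_cost / applicants           # ~$0.25 per candidate

print(f"{credits_used} credits = {share_of_plan:.0%} of the plan, "
      f"~${dollar_cost:.2f} per role, ~${per_candidate:.2f} per candidate")
```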

The cost of not doing it — in bad candidates advanced, good candidates rejected, recruiter hours burned re-reading resumes to verify opaque scores, and compliance exposure if things go wrong — is substantially larger. We've done the math for a handful of real pipelines; we're happy to do it for yours.

The category wedge

Most AI-in-hiring coverage in the last two years has been defensive — bias risk, hallucination risk, compliance risk. The vendor response has been to hide the AI, present scores, refuse to expose reasoning. That's a short-term move that makes the long-term regulatory posture worse, not better.

CertAIn's bet is the opposite. Lean into reasoning. Make every output a paragraph a recruiter can defend, a candidate can eventually see, and a regulator can read as documentation. This isn't a feature — it's the category. "AI hiring that explains itself."

Every page of our product reinforces it. Every comparison we draw against competitors hinges on it. And the competitor that matches us on it has to rebuild their product — because they architected around scores, and we architected around reasoning.

Try it

The fastest way to see the difference is to run a single Ranking action on a real JD you have open. The free trial is 30 days with 100 credits and no card. That's enough to rank 100 candidates against one JD and see the paragraphs we're describing. We're taking design partners right now.

Take CertAIn for a run on a real JD.