Documentation lies. Here's how we built a verifier that runs on every commit

Enio Rocha

TL;DR: Every codebase silently accumulates documentation debt. We built a system that treats each numerical claim as a verifiable contract — no evidence, no commit. Result: from 33 unbacked claims to zero drift detected across 9 kernel domains.

The number that made me uncomfortable

First dry-run: 33 unbacked claims in CAPABILITY_REGISTRY.md

That was the output from the first Evidence Gate run on 2026-04-09. Not errors. Not bugs. Thirty-three assertions — "24 agents", "16 PII patterns", "6 domains covered" — with zero automated proof they were still true.

They weren't lies. They were accurate when written. The problem is that the code kept evolving and the docs stayed behind. After five weeks of intense development, 33 claims had no proof they were still true, and I hadn't noticed.

What bothered me wasn't the 33. It was realizing how many other projects — including my own — probably have twice that and have never measured it.

Why this keeps happening

Documentation has the same problem as manual tests: it works until the first change nobody tracked.

When you write "the system has 24 agents," you're making a claim. Without a mechanism to verify that claim against real code, it will diverge. It's not a question of discipline — it's inevitable at any non-trivial velocity.

The typical way drift becomes invisible:

| Step | What happens | Result |
| --- | --- | --- |
| 1 | Dev ships feature + updates docs in the same PR | Correct |
| 2 | Second dev adds a smaller feature, touches only code | Doc stale +1 |
| 3 | Third dev removes deprecated feature, touches only code | Doc stale +1 |
| 4 | README still shows the original number. Nobody checked. | Invisible drift |

Solo projects make it worse: the same person who wrote the doc assumes they remember what changed. They don't. Karpathy has a principle here: activity is not progress. Writing "24 agents" in your README when you have 31 is documentation activity that produces confusion, not clarity.

What we built: Doc-Drift Shield in three layers

Layer 1: .egos-manifest.yaml — the contract

Every numerical claim in code or docs needs a manifest entry with the command that proves it:

```yaml
claims:
  - id: total_agents
    description: "Agents registered in agents.json"
    command: "python3 -c 'import json; data=json.load(open(\"agents/registry/agents.json\")); print(len(data.get(\"agents\", [])))'"
    tolerance: "min:18"
    last_value: "24"
```

The command field is the proof: a shell command that, run right now, returns the real current value. If that value falls outside the tolerance, the claim has drifted. The manifest lives at the repo root and has two consumers: the pre-commit hook and the daily cron.
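To make the tolerance field concrete, here is a minimal Python sketch of how a value could be checked against it. The semantics are an assumption, since the article does not spell them out: "min:N" is read as a floor, and "±N" as a maximum distance from last_value. The function name within_tolerance is hypothetical.

```python
def within_tolerance(value: int, tolerance: str, last_value: int) -> bool:
    """Return True if `value` satisfies `tolerance`.

    Assumed semantics (not confirmed by the article):
      "min:N" -> value must be at least N
      "±N"    -> value must be within N of last_value
    """
    tolerance = tolerance.strip()
    if tolerance.startswith("min:"):
        return value >= int(tolerance[len("min:"):])
    if tolerance.startswith("±"):
        return abs(value - last_value) <= int(tolerance[1:])
    raise ValueError(f"unknown tolerance format: {tolerance!r}")


# The total_agents entry: tolerance "min:18", last_value 24.
print(within_tolerance(31, "min:18", 24))
print(within_tolerance(12, "min:18", 24))
```

Under this reading, a floor such as min:18 accepts 31 agents even when the docs still say 24, so it guards against regressions rather than against a stale count. A "±N" tolerance would be the stricter choice for numbers quoted verbatim in prose.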

Layer 2: scripts/evidence-gate.ts — the verifier

Runs in two contexts:

  • In the pre-commit hook, --staged-only mode: checks only the files you're committing
  • Daily cron (00:17 BRT): checks the entire repo and writes docs/jobs/YYYY-MM-DD-doc-drift-verifier.json

```shell
bun scripts/evidence-gate.ts
```

Current output (2026-04-16):

```json
{
  "total_claims": 15,
  "passed": 15,
  "drifted": 0,
  "domains_ok": 9
}
```
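The real verifier is scripts/evidence-gate.ts, whose source is not shown here. As a rough illustration of the loop it implies, the Python sketch below runs each claim's command, applies a check, and builds a summary with the same total_claims / passed / drifted keys. Everything else is an assumption: the names run_claim, verify, is_ok and report_path, the extra drifted_ids field, and the use of inline dicts instead of parsing the YAML manifest (to stay within the standard library). It needs a POSIX shell.

```python
import json
import subprocess
from datetime import date


def run_claim(command: str, timeout: int = 30) -> int:
    """Run a claim's shell command and parse its stdout as an integer.

    Raises instead of returning a default, so a broken command cannot
    silently count as a pass.
    """
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    if proc.returncode != 0:
        raise RuntimeError(f"command failed ({proc.returncode}): {proc.stderr.strip()}")
    return int(proc.stdout.strip())


def verify(claims: list[dict], is_ok) -> dict:
    """Check every claim and return a summary dict.

    `is_ok(claim, value)` is a hypothetical callback that applies the
    claim's tolerance. `domains_ok` is omitted because the article does
    not say how domains are derived.
    """
    drifted = []
    for claim in claims:
        try:
            ok = is_ok(claim, run_claim(claim["command"]))
        except (RuntimeError, ValueError, subprocess.TimeoutExpired):
            ok = False  # an unrunnable proof is treated as drift
        if not ok:
            drifted.append(claim["id"])
    return {
        "total_claims": len(claims),
        "passed": len(claims) - len(drifted),
        "drifted": len(drifted),
        "drifted_ids": drifted,
    }


def report_path(day: date) -> str:
    """Path of the daily report, following the naming given in the article."""
    return f"docs/jobs/{day.isoformat()}-doc-drift-verifier.json"


if __name__ == "__main__":
    demo = [
        {"id": "always_24", "command": "echo 24", "min": 18},
        {"id": "broken", "command": "exit 3", "min": 1},
    ]
    summary = verify(demo, lambda claim, value: value >= claim["min"])
    print(json.dumps(summary, indent=2))
    print(report_path(date(2026, 4, 16)))
```

Treating a failed or non-numeric command as drift is a deliberate choice in this sketch: it turns an environment problem into a visible failure instead of a quiet pass.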

Layer 3: Progressive activation

Week 1 (warning mode): shows violations, lets commits through. You learn what's breaking without blocking work.

Week 2+ (blocking mode): the commit fails if drift is detected in kernel docs. An override is available by putting DOC-DRIFT-ACCEPTED: <reason> in the commit body, but the reason has to be stated.
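The two modes and the override can be sketched as a small decision function. This is an illustration under assumptions, not the project's actual hook: the mode names "warn" and "block", the function names, and the rule that an empty reason does not count are all hypothetical.

```python
import re
from typing import Optional

# Matches an override line such as "DOC-DRIFT-ACCEPTED: renamed endpoint".
# [ \t]* keeps the match on one line; \S requires a non-empty reason.
OVERRIDE_RE = re.compile(r"^DOC-DRIFT-ACCEPTED:[ \t]*(\S.*)$", re.MULTILINE)


def override_reason(commit_message: str) -> Optional[str]:
    """Return the override reason from the commit body, or None if absent."""
    # The body is everything after the subject line.
    _, _, body = commit_message.partition("\n")
    match = OVERRIDE_RE.search(body)
    return match.group(1).strip() if match else None


def exit_code(drifted: int, mode: str, commit_message: str = "") -> int:
    """Decide the hook's exit status: 0 lets the commit through, 1 blocks it.

    `mode` is a hypothetical setting: "warn" (week 1) or "block" (week 2+).
    """
    if drifted == 0 or mode == "warn":
        return 0
    return 0 if override_reason(commit_message) else 1
```

One wiring detail to keep in mind: Git runs the pre-commit hook before the commit message exists, so a check that reads the commit body would normally live in a commit-msg hook, which receives the path of the message file as its argument. How the project handles this is not stated in the article.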

Real cases from the first run

| Claim found | Doc value | Real value | Action |
| --- | --- | --- | --- |
| Guard Brasil endpoint | /v1/check | /v1/inspect | Doc corrected |
| Registered agents | 24 | 31 | Manifest updated |
| Repo path | /home/enio/br-acc | /home/enio/intelink | Propagated to 4 docs |
| Kernel packages | 8 | 19 | Claim rewritten |

All four were invisible before the Evidence Gate. None caused a production bug — but two of them (the Guard Brasil endpoint and the intelink path) would have confused anyone trying to use the docs to integrate the system. The wrong endpoint especially: someone trying /v1/check would have gotten a 404 with no hint of what happened.

What the manifest covers today

| Claim | Tolerance | Last value |
| --- | --- | --- |
| Total agents | min:18 | 24 |
| Declared capabilities | ±10 | 27 |
| Governance files | ±5 | 59 |
| Kernel packages | ±2 | 19 |
| 30d commits (all repos) | min:50 | 1213 |

What didn't work

  • Fragile commands: verifications that depend on the local environment fail silently in CI. Open problem: containerize the verifier.
  • Pre-commit hooks have a bypass: --no-verify still exists. The daily cron is the real backstop — without it, the system is easy to circumvent.
  • Overhead for smaller projects: makes sense for 15+ claims. For a README with 3 statements, it's overkill.
  • Commands need their own proof: if a verification command is wrong, the manifest passes even with real drift. The meta-problem doesn't have an elegant solution yet.
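One partial mitigation for the last point is to run a verification command against a fixture whose answer is known in advance. The Python sketch below does this for the counting logic of the total_agents command. It is an assumption-laden illustration: the path is passed as an argument so the one-liner can be pointed at a temporary file, and the names COUNT_AGENTS, count_agents and check_command_against_fixture are hypothetical.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Same counting logic as the total_agents command, with the path passed
# as an argument so it can be pointed at a fixture.
COUNT_AGENTS = (
    "import json, sys; "
    "data = json.load(open(sys.argv[1])); "
    "print(len(data.get('agents', [])))"
)


def count_agents(path: Path) -> int:
    """Run the counting one-liner against `path` and return its output."""
    out = subprocess.run(
        [sys.executable, "-c", COUNT_AGENTS, str(path)],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())


def check_command_against_fixture() -> bool:
    """Feed the command a file whose answer is known in advance."""
    with tempfile.TemporaryDirectory() as tmp:
        fixture = Path(tmp) / "agents.json"
        agents = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
        fixture.write_text(json.dumps({"agents": agents}))
        return count_agents(fixture) == 3
```

This only shows that the command counts correctly on the fixture's shape. It does not prove the real file has that shape: because of the `.get("agents", [])` default, a renamed key would make the command print 0 rather than fail.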

Open questions

  • How to connect the manifest to CI/CD without depending on the local environment?
  • Is it possible to auto-generate manifest entries from markdown assertions?
  • How to handle qualitative claims ("the system is fast") that don't have a direct number?
  • Should the evidence gate block PRs on GitHub, not just local commits?
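For the auto-generation question, a naive starting point is to scan markdown for a number followed by one or two words and treat each hit as a candidate for human review. The Python sketch below is only that: the pattern, the names CLAIM_RE and candidate_claims, and the sample text are assumptions, and the pattern will over-match (for example "3 times a day").

```python
import re

# A number followed by one or two words, e.g. "24 agents" or "16 PII patterns".
CLAIM_RE = re.compile(r"\b(\d+)\s+([A-Za-z]+(?:\s+[A-Za-z]+)?)")


def candidate_claims(markdown: str) -> list[tuple[int, int, str]]:
    """Return (line_number, value, label) for each numeric phrase found."""
    found = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for number, label in CLAIM_RE.findall(line):
            found.append((lineno, int(number), label))
    return found


sample = "The system has 24 agents, 16 PII patterns.\nIt covers 6 domains."
for hit in candidate_claims(sample):
    print(hit)
```

A scan like this can find where claims live, but it cannot write the command that proves them. That part still needs someone who knows where the real number comes from.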

Files referenced in this article

Open source. Everything here is available at github.com/enioxt/egos. If you're building something similar or want to apply this in your context, reach out on X: @eniorocha_. Building in public.