Documentation lies. Here's how we built a verifier that runs on every commit
TL;DR: Every codebase silently accumulates documentation debt. We built a system that treats each numerical claim as a verifiable contract — no evidence, no commit. Result: from 33 unbacked claims to zero drift detected across 9 kernel domains.
The number that made me uncomfortable
    First dry-run: 33 unbacked claims in CAPABILITY_REGISTRY.md

That was the output from the first Evidence Gate run on 2026-04-09. Not errors. Not bugs. Thirty-three assertions ("24 agents", "16 PII patterns", "6 domains covered") with zero automated proof they were still true.
They weren't lies. They were accurate when written. The problem is that code kept evolving and the docs stayed behind. In five weeks of intense development, 33 claims had drifted without me noticing.
What bothered me wasn't the 33. It was realizing how many other projects — including my own — probably have twice that and have never measured it.
Why this keeps happening
Documentation has the same problem as manual tests: it works until the first change nobody tracked.
When you write "the system has 24 agents," you're making a claim. Without a mechanism to verify that claim against real code, it will diverge. It's not a question of discipline — it's inevitable at any non-trivial velocity.
The typical pattern of how it becomes invisible:
| Step | What happens | Result |
|---|---|---|
| 1 | Dev ships feature + updates docs in the same PR | Correct |
| 2 | Second dev adds a smaller feature, touches only code | Doc stale +1 |
| 3 | Third dev removes deprecated feature, touches only code | Doc stale +1 |
| 4 | README still shows the original number. Nobody checked. | Invisible drift |
Solo projects make it worse: the same person who wrote the doc assumes they remember what changed. They don't. Karpathy has a principle here: activity is not progress. Writing "24 agents" in your README when you have 31 is documentation activity that produces confusion, not clarity.
What we built: Doc-Drift Shield in three layers
Layer 1: .egos-manifest.yaml — the contract
Every numerical claim in code or docs needs a manifest entry with the command that proves it:
    claims:
      - id: total_agents
        description: "Agents registered in agents.json"
        command: "python3 -c 'import json; data=json.load(open(\"agents/registry/agents.json\")); print(len(data.get(\"agents\", [])))'"
        tolerance: "min:18"
        last_value: "24"

The command field is the proof. A shell command that, run right now, returns the real current value. If the number has drifted beyond tolerance, it's drift. The manifest lives at the repo root and is consumed by two things: the pre-commit hook and the daily cron.
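To make "the command is the proof" concrete, here is a minimal Python sketch of running one claim's command and reading back the number. It is not the actual verifier (that is scripts/evidence-gate.ts); the function name `measure`, the timeout, and the `echo 24` stand-in command are assumptions made for illustration.

```python
import subprocess


def measure(command: str, timeout: float = 30.0) -> int:
    """Run a claim's shell command and return the integer it prints."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    if result.returncode != 0:
        # Fail loudly instead of treating a broken command as a passing claim.
        raise RuntimeError(
            f"command exited with {result.returncode}: {result.stderr.strip()}"
        )
    return int(result.stdout.strip())


# Stand-in claim (assumption): `echo 24` replaces the real agents.json
# one-liner so the sketch runs anywhere with a POSIX shell.
claim = {"id": "total_agents", "command": "echo 24", "last_value": "24"}

current = measure(claim["command"])
changed = current != int(claim["last_value"])
print(f"{claim['id']}: current={current} recorded={claim['last_value']} changed={changed}")
```

Raising on a non-zero exit code is a deliberate choice in this sketch: a command that cannot run should count as a failure, not as a silent pass.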
Layer 2: scripts/evidence-gate.ts — the verifier
Runs in two contexts:
- In the pre-commit hook, --staged-only mode: checks only the files you're committing
- Daily cron (00:17 BRT): checks the entire repo and writes docs/jobs/YYYY-MM-DD-doc-drift-verifier.json
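For the daily run, the report step could look roughly like the Python sketch below. The field names total_claims, passed and drifted mirror the verifier's own output; everything else (the function names, the shape of `results`, the example claim ids) is hypothetical, and domains_ok is left out because the article does not say how domains are derived.

```python
import json
from datetime import date
from pathlib import Path


def summarize(results: dict[str, bool]) -> dict[str, int]:
    """Collapse per-claim pass/fail results into summary counts."""
    passed = sum(1 for ok in results.values() if ok)
    return {
        "total_claims": len(results),
        "passed": passed,
        "drifted": len(results) - passed,
    }


def report_path(day: date, root: Path = Path("docs/jobs")) -> Path:
    """Build the dated report path: <root>/YYYY-MM-DD-doc-drift-verifier.json."""
    return root / f"{day.isoformat()}-doc-drift-verifier.json"


def write_report(results: dict[str, bool], day: date, root: Path = Path("docs/jobs")) -> Path:
    """Write the summary as JSON and return the path that was written."""
    path = report_path(day, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(summarize(results), indent=2) + "\n")
    return path


# Hypothetical results keyed by claim id (True means the claim still holds).
results = {"total_agents": True, "kernel_packages": True, "governance_files": False}
print(json.dumps(summarize(results)))
print(report_path(date(2026, 4, 16)).as_posix())
```

Keeping the summary and the path as pure functions means the only side effect is the final write, which makes the cron job easy to dry-run.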
    bun scripts/evidence-gate.ts
    # Current output (2026-04-16):
    {
      "total_claims": 15,
      "passed": 15,
      "drifted": 0,
      "domains_ok": 9
    }

Layer 3: Progressive activation
Week 1 (warning mode): shows violations, lets commits through. You learn what's breaking without blocking work.
Week 2+ (blocking mode): commit fails if drift is detected in kernel docs. Override available via DOC-DRIFT-ACCEPTED: reason in the commit body — but it has to be justified.
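The two modes reduce to a small decision about the exit code. Here is a minimal Python sketch under stated assumptions: the mode names "warning" and "blocking", the function name `gate_exit_code`, and the rule that the override needs a non-empty reason on the same line are all my reading of the description above, not the verifier's actual code.

```python
import re

# Override marker followed by a non-empty reason on the same line.
OVERRIDE = re.compile(r"^DOC-DRIFT-ACCEPTED:[ \t]*\S.*$", re.MULTILINE)


def gate_exit_code(drifted: int, mode: str, commit_message: str = "") -> int:
    """Return 0 to let the commit through, 1 to block it."""
    if mode not in ("warning", "blocking"):
        raise ValueError(f"unknown mode: {mode!r}")
    if drifted == 0:
        return 0
    if mode == "warning":
        # Week 1: report violations but never block.
        return 0
    if OVERRIDE.search(commit_message):
        # Week 2+: drift is accepted only with a stated reason.
        return 0
    return 1


message = "Update agents\n\nDOC-DRIFT-ACCEPTED: registry migration in progress"
print(gate_exit_code(2, "warning"))            # lets the commit through
print(gate_exit_code(2, "blocking"))           # blocks
print(gate_exit_code(2, "blocking", message))  # override with a reason
```

One caveat that comes from git itself rather than from EGOS: the pre-commit hook runs before the commit message exists, so a message-based override has to be checked in a commit-msg hook or later. The article does not say how the verifier wires this.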
Real cases from the first run
| Claim found | Doc value | Real value | Action |
|---|---|---|---|
| Guard Brasil endpoint | /v1/check | /v1/inspect | Doc corrected |
| Registered agents | 24 | 31 | Manifest updated |
| Repo path | /home/enio/br-acc | /home/enio/intelink | Propagated to 4 docs |
| Kernel packages | 8 | 19 | Claim rewritten |
All four were invisible before the Evidence Gate. None caused a production bug — but two of them (the Guard Brasil endpoint and the intelink path) would have confused anyone trying to use the docs to integrate the system. The wrong endpoint especially: someone trying /v1/check would have gotten a 404 with no hint of what happened.
What the manifest covers today
| Claim | Tolerance | Last value |
|---|---|---|
| Total agents | min:18 | 24 |
| Declared capabilities | ±10 | 27 |
| Governance files | ±5 | 59 |
| Kernel packages | ±2 | 19 |
| 30d commits (all repos) | min:50 | 1213 |
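The table uses two tolerance notations. A minimal Python sketch of how they might be evaluated follows; it assumes that min:N is an absolute floor and that ±N is measured against the recorded last value. Both semantics are my reading of the notation and are not confirmed by the verifier's source, and `within_tolerance` is a hypothetical name.

```python
def within_tolerance(current: int, tolerance: str, last_value: int) -> bool:
    """Check a measured value against a tolerance string.

    Assumed semantics (not confirmed by the verifier's source):
      "min:N" -> current must be at least N
      "±N"    -> current must be within N of the recorded last_value
    """
    if tolerance.startswith("min:"):
        return current >= int(tolerance[len("min:"):])
    if tolerance.startswith("±"):
        return abs(current - last_value) <= int(tolerance[1:])
    raise ValueError(f"unrecognized tolerance: {tolerance!r}")


# Rows taken from the table above: (claim, tolerance, last value).
rows = [
    ("Total agents", "min:18", 24),
    ("Declared capabilities", "±10", 27),
    ("Governance files", "±5", 59),
    ("Kernel packages", "±2", 19),
    ("30d commits (all repos)", "min:50", 1213),
]

for name, tolerance, last in rows:
    # Re-checking the recorded value against itself should always pass.
    print(name, within_tolerance(last, tolerance, last))
```

Under this reading, a min: bound only catches a value that falls below the floor: growth from 24 to 31 agents would still pass min:18, whereas a ±2 bound on kernel packages would flag a move from 19 to 22.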
What didn't work
- Fragile commands: verifications that depend on local environment fail silently in CI. Open problem: containerize the verifier.
- Pre-commit hooks have a bypass: --no-verify still exists. The daily cron is the real backstop; without it, the system is easy to circumvent.
- Overhead for smaller projects: makes sense for 15+ claims. For a README with 3 statements, it's overkill.
- Commands need their own proof: if a verification command is wrong, the manifest passes even with real drift. The meta-problem doesn't have an elegant solution yet.
Open questions
- How to connect the manifest to CI/CD without depending on local environment?
- Is it possible to auto-generate manifest entries from markdown assertions?
- How to handle qualitative claims ("the system is fast") that don't have a direct number?
- Should the evidence gate block PRs on GitHub, not just local commits?
Files referenced in this article
- .egos-manifest.yaml — verifiable claims contract for the EGOS kernel
- scripts/evidence-gate.ts — verifier that runs in pre-commit and the daily cron
- docs/CAPABILITY_REGISTRY.md — canonical capability registry (primary drift entry point)
- .guarani/RULES_INDEX.md — kernel governance rules, also monitored by the gate
Related in EGOS
- Wrong Altitude — how to build at the right abstraction level so you don't create invisible debt from the start
- Guard Brasil: 16 PII Patterns in 4ms — the system whose endpoint the manifest now monitors automatically
Open source. Everything here is available at github.com/enioxt/egos. If you're building something similar or want to apply this in your context, reach out on X: @eniorocha_. Building in public.