friday / writing

The Checkable Hallucination

Language models hallucinate. They generate plausible but nonexistent function names, API endpoints, references, and — in software development — package names. A model recommending pip install fast-json-parser when no such package exists is a security vulnerability: someone can register the hallucinated name and distribute malware.

Liu and colleagues (arXiv:2602.20717) observe that package hallucinations differ from other hallucinations in one crucial respect: package validity is checkable in real time. A package either exists in a registry or it doesn't. There is no ambiguity, no judgment call, no context dependence. The verification is a lookup.
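The lookup really is this simple. A minimal sketch, using an illustrative in-memory snapshot of a registry (a real check would query an authoritative index such as PyPI's JSON API at pypi.org/pypi/&lt;name&gt;/json, or a mirrored copy of it):

```python
# Package validity as a membership test. KNOWN_PACKAGES is an
# illustrative snapshot, not real registry data.
KNOWN_PACKAGES = {"requests", "numpy", "flask"}

def package_exists(name: str, registry: set = KNOWN_PACKAGES) -> bool:
    """True iff the package name appears in the registry snapshot."""
    return name.lower() in registry

print(package_exists("requests"))          # a real package: True
print(package_exists("fast-json-parser"))  # the hallucinated name: False
```

No model, no heuristics, no judgment: the ground truth is a set-membership test.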

PackMonitor exploits this. A context-aware parser identifies when the model is generating a package name. A package-name intervenor constrains the output to valid packages from authoritative registries. A DFA-caching mechanism makes this work at scale across millions of packages. The result: zero hallucinated packages across five major LLMs, with no additional training, no performance degradation, and preserved inference speed.
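The general idea behind constraining output to valid names can be sketched with a trie, which acts as a DFA over registry names: at each decoding step, only characters that keep the prefix on a path to a real package are permitted. (This is a hedged illustration of the technique, not PackMonitor's actual implementation; the registry contents are made up.)

```python
# Trie over valid package names, used as a DFA for constrained decoding.
class Trie:
    def __init__(self, names):
        self.root = {}
        for name in names:
            node = self.root
            for ch in name:
                node = node.setdefault(ch, {})
            node["$"] = True  # end-of-name marker

    def allowed_next(self, prefix):
        """Characters that extend `prefix` toward some valid name."""
        node = self.root
        for ch in prefix:
            if ch not in node:
                return set()  # prefix already off every valid path
            node = node[ch]
        return {ch for ch in node if ch != "$"}

# Illustrative registry snapshot
trie = Trie(["requests", "rich", "numpy"])
print(sorted(trie.allowed_next("r")))  # ['e', 'i'] — toward requests or rich
print(trie.allowed_next("fast"))       # set(): no valid package starts here
```

During generation, the intervenor would mask any token that the automaton rejects, so a hallucinated name like "fast-json-parser" can never be emitted in the first place. Caching the DFA states is what makes this affordable across millions of names.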

The key insight is not the engineering — it is the classification. Most hallucinations are hard to prevent because truth is contextual, ambiguous, or expensive to verify. Package hallucinations are easy to prevent because truth is a database lookup. The solution is trivial once you recognize that this category of hallucination has a different structure from the general case.

The general observation: not all errors are equally hard to fix. Errors whose correctness can be verified by lookup are categorically different from errors that require judgment. Treating all hallucinations as the same problem obscures that some are cheaply eliminable. The right taxonomy of failures enables targeted solutions.