The Probabilistic Error Taxonomy
Probabilistic failures don't have stable identities, so classifying them by what went wrong is only half the picture. This taxonomy adds a second dimension — where in the pipeline the failure was introduced — because the corrective action depends on the combination, not either axis alone.
Error type is the what. Origin is the where.
The two axes are orthogonal: the same error type can arise from several origins, and the same origin can produce several error types. That's why the framework is a matrix of cells rather than a one-to-one list — and why it mirrors how quality teams already investigate deviations, identifying both what happened and where in the process it happened.
What went wrong
- Fabrication — citation, data-point, and entity fabrication
- Misinterpretation — recognition errors, reasoning errors, omission
- Contextual misapplication — population mismatch, temporal validity, spec non-compliance
- Confidence miscalibration — overconfidence, hedging, unflagged conflicts
- Boundary violation — scope creep, authority creep, adversarial breach
- Population bias — demographic, safety-signal, and site/transferability bias
Where it was introduced
- Training data — data poisoning, privacy, distribution shift, source error
- Retrieval / RAG layer — chunking, embedding drift, grounded hallucination, multi-hop
- Model inference — hallucination, sycophancy, non-determinism, prompt injection
- Human–AI interface — automation bias, confidence miscommunication, deskilling
- Agent orchestration — multi-step planning, compounding propagation, tool-use
- Supplier — silent model updates, deprecation, multi-tenancy leakage, DPA drift
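The two axes above can be sketched as a pair of enumerations whose cross product gives the matrix. This is a minimal illustration, not the framework's actual implementation: the class names `ErrorType`, `Origin`, and `Cell` are assumptions introduced here, and only the category labels come from the lists above.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class ErrorType(Enum):
    """The 'what went wrong' axis (labels taken from the list above)."""
    FABRICATION = "Fabrication"
    MISINTERPRETATION = "Misinterpretation"
    CONTEXTUAL_MISAPPLICATION = "Contextual misapplication"
    CONFIDENCE_MISCALIBRATION = "Confidence miscalibration"
    BOUNDARY_VIOLATION = "Boundary violation"
    POPULATION_BIAS = "Population bias"


class Origin(Enum):
    """The 'where it was introduced' axis (labels taken from the list above)."""
    TRAINING_DATA = "Training data"
    RETRIEVAL = "Retrieval / RAG layer"
    MODEL_INFERENCE = "Model inference"
    HUMAN_AI_INTERFACE = "Human–AI interface"
    AGENT_ORCHESTRATION = "Agent orchestration"
    SUPPLIER = "Supplier"


@dataclass(frozen=True)
class Cell:
    """One intersection of the matrix (hypothetical helper type)."""
    error_type: ErrorType
    origin: Origin

    def label(self) -> str:
        return f"{self.error_type.value} × {self.origin.value}"


# Every error type paired with every origin: 6 × 6 cells.
ALL_CELLS = [Cell(e, o) for e, o in product(ErrorType, Origin)]

print(len(ALL_CELLS))
print(Cell(ErrorType.FABRICATION, Origin.MODEL_INFERENCE).label())
```

Keeping the axes as separate types makes the orthogonality explicit: a corrective control would be keyed on a `Cell`, never on an error type or an origin alone.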
Same error class, different origin, different fix
A fabrication is the same error class whether it originated at model inference, in the retrieval layer, or at the human review point, but the corrective control is different in each case. The canonical case below is Mata v. Avianca: one fabrication, two origins, two different fixes. Cells marked ● are worked beneath the grid; the fully populated 36-cell matrix is the client deliverable.
| | Training data | Retrieval / RAG | Model inference | Human–AI interface | Agent orchestration | Supplier |
|---|---|---|---|---|---|---|
| Fabrication | | ● | ● | ● | | |
| Misinterpretation | | | ● | | | |
| Contextual misapplication | ● | | | | | ● |
| Confidence miscalibration | | | ● | | | |
| Boundary violation | | | ● | | | |
| Population bias | ● | | | | | |
Fabrication × Model inference
The canonical hallucination. In Mata v. Avianca (S.D.N.Y. 2023), the model fabricated six federal case citations in a single inference run — correct citation structure, invented substance — with no grounding in any retrieved source.
Fabrication × Human–AI interface
The same case, one layer out. When the attorney asked whether the cases were real, the model confirmed they were — and no independent check stood between that answer and the filed brief.
Fabrication × Retrieval / RAG
RAG reduces but doesn't eliminate fabrication. Grounded hallucination occurs when the model extrapolates beyond the retrieved context — producing a citation the source doesn't actually support.
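One possible detection check for this cell is to confirm that every citation in an answer actually appears in the retrieved context. The sketch below assumes citations have already been extracted as strings; the function name `unsupported_citations` and the sample identifiers are hypothetical. A verbatim match is only a crude proxy: it shows the source was retrieved, not that the source supports the claim.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so formatting differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unsupported_citations(citations, retrieved_chunks):
    """Return the citations that appear in none of the retrieved chunks."""
    haystack = [_normalize(chunk) for chunk in retrieved_chunks]
    return [
        c for c in citations
        if not any(_normalize(c) in chunk for chunk in haystack)
    ]


# Made-up identifiers, for illustration only.
chunks = ["Dose limits are set out in Doc-A  §4.2 and apply to adults."]
cited = ["Doc-A §4.2", "Doc-B §1"]
print(unsupported_citations(cited, chunks))
```

Anything the function returns was cited without being retrieved, which is a candidate grounded hallucination to route to review.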
Misinterpretation × Model inference
Right data, wrong reasoning. Cabral et al. (2024) documented a model that correctly identified every relevant clinical finding from a vignette, then drew an incorrect inferential conclusion from it.
Contextual misapplication × Training data
Right-looking, wrong-context. A clinical LLM cites a withdrawn or superseded FDA guidance document because its knowledge predates the change (temporal validity / knowledge cutoff).
Contextual misapplication × Supplier
The model changed underneath a validated workflow. Between the March and June 2023 GPT-4 versions, USMLE accuracy fell from 86.6% to 82.1% with no announcement (Chen, Zaharia & Zou, 2023) — enough to break a workflow validated against the earlier endpoint.
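A workflow can guard against this kind of silent change by re-running its validation benchmark and comparing the result with the validated baseline. The sketch below uses the two accuracy figures quoted above; the 2.0-point tolerance and the function names are assumptions for illustration, not values from the source or from any standard.

```python
def accuracy_drop(baseline_pct: float, current_pct: float) -> float:
    """Drop in percentage points relative to the validated baseline."""
    return baseline_pct - current_pct


def needs_revalidation(baseline_pct: float, current_pct: float,
                       tolerance_points: float) -> bool:
    """True when the drop exceeds the tolerance agreed at validation time."""
    return accuracy_drop(baseline_pct, current_pct) > tolerance_points


# Figures quoted above; the tolerance is a hypothetical choice.
drop = accuracy_drop(86.6, 82.1)
print(round(drop, 1))
print(needs_revalidation(86.6, 82.1, tolerance_points=2.0))
```

The check only helps if it runs on a schedule against the live endpoint, since the premise of this cell is that the supplier does not announce the change.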
Confidence miscalibration × Model inference
Maximum confidence on fabricated content. The Mata model didn't hedge — it asserted the invented cases were real, collapsing uncertainty exactly where it mattered most.
Boundary violation × Model inference
Guardrails that don't hold under pressure. Lee et al. (2025) found prompt-injection attacks succeeded in 94.4% of 216 controlled medical dialogues, including most high-harm trials.
Population bias × Training data
Accurate in aggregate, wrong for the underrepresented. Larrazabal et al. (2020) showed a consistent performance drop for underrepresented genders once training data fell below a minimum balance.
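The gap between aggregate and subgroup performance can be made visible by breaking accuracy down per group. This is a minimal sketch on synthetic records invented for illustration; the function name `accuracy_by_group` and the group labels are assumptions, and the numbers are not from Larrazabal et al.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, correct) pairs. Returns accuracy per group."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += int(correct)
    return {group: hits[group] / totals[group] for group in totals}


# Synthetic data: group "B" is underrepresented and always misclassified.
records = [("A", True)] * 9 + [("B", False)]
overall = sum(correct for _, correct in records) / len(records)

print(overall)
print(accuracy_by_group(records))
```

Here the aggregate figure looks acceptable while the smaller group fails entirely, which is the pattern this cell describes.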
Available in engagement
Every remaining intersection carries its own representative failure mode, detection method, and corrective protocol — the fully populated 36-cell matrix is delivered as part of a client engagement rather than published here.
Where this sits in the architecture
The taxonomy is a Control-Layer component of the House of AI Trust. It gives the Validation Lifecycle's evaluation-design step (03) a structured map of failure modes to test against, and gives the continuous-monitoring step (06) a vocabulary for classifying what it catches.
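As one way to picture the monitoring vocabulary, incidents can be tagged with both axes and tallied per cell. The record fields `error_type` and `origin`, the function name, and the sample incidents are all hypothetical; the source does not describe how step 06 stores its findings.

```python
from collections import Counter


def tally_by_cell(incidents):
    """Count incidents per (error_type, origin) cell of the matrix."""
    return Counter((i["error_type"], i["origin"]) for i in incidents)


# Hypothetical monitoring log entries.
incidents = [
    {"error_type": "Fabrication", "origin": "Model inference"},
    {"error_type": "Fabrication", "origin": "Retrieval / RAG"},
    {"error_type": "Fabrication", "origin": "Model inference"},
    {"error_type": "Population bias", "origin": "Training data"},
]

counts = tally_by_cell(incidents)
for (error_type, origin), n in counts.most_common():
    print(f"{error_type} × {origin}: {n}")
```

Tallying by cell rather than by error type alone keeps the two fabrication origins apart, so each can be routed to its own corrective control.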