The Britt Probabilistic Validation Lifecycle
Seven steps that take a probabilistic AI system from intended use to audit-ready evidence — the execution methodology that runs inside Layer 3 of the House of AI Trust. Same skeleton as the GAMP 5 V-model; different muscle for systems whose outputs vary.
Seven steps, one continuous cycle
The lifecycle is not a one-time gate. Steps 1–5 establish the system; steps 6 and 7 (continuous monitoring and change control) feed back into context of use whenever the model, its data, or its operating context shifts.
Context of Use
Who uses the model, for what decision, with which failure modes? The context of use fixes the intended purpose and the consequences of getting it wrong — everything downstream is scoped from here.
Risk Tiering
Risk is tiered by consequence to patient safety and data integrity — not by software category. The tier sets how much evidence each later step must produce, so a high-consequence use earns deeper validation than a low-stakes one.
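A minimal sketch of consequence-based tiering follows. The three-level scale, the "take the worse of the two" rule, and the case counts are illustrative assumptions, not values the lifecycle prescribes; the point is only that the tier is derived from consequence and then drives how much evidence later steps must produce.

```python
from enum import IntEnum


class Consequence(IntEnum):
    """Hypothetical three-level consequence scale (assumed for illustration)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def risk_tier(patient_safety: Consequence, data_integrity: Consequence) -> Consequence:
    """Tier by the worse of the two consequences; software category plays no part."""
    return max(patient_safety, data_integrity)


# Hypothetical evidence depth per tier: minimum evaluation cases for later steps.
MIN_EVAL_CASES = {Consequence.LOW: 50, Consequence.MEDIUM: 200, Consequence.HIGH: 1000}

tier = risk_tier(Consequence.LOW, Consequence.HIGH)
print(tier.name, MIN_EVAL_CASES[tier])  # prints "HIGH 1000"
```

In this sketch a use that is low-risk for patient safety but high-risk for data integrity still lands in the highest tier, which is one way to keep a single weak dimension from being averaged away.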
Evaluation Design
Traceable, domain-specific evaluation sets with lineage (ALCOA+), leakage controls, and real edge cases. The Error Taxonomy maps candidate failure modes to consequence-weighted outcomes so the eval targets what actually matters.
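One simple leakage control can be sketched as a fingerprint check between the evaluation set and the training data. The function names and sample records below are hypothetical, and exact matching after light normalisation is an assumed technique: it catches near-verbatim duplicates only, not paraphrases.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash a case after light normalisation so trivial edits do not hide a duplicate."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def leaked_cases(eval_set: list[str], training_set: list[str]) -> list[str]:
    """Return evaluation cases whose fingerprint also appears in the training data."""
    seen = {fingerprint(t) for t in training_set}
    return [case for case in eval_set if fingerprint(case) in seen]


# Hypothetical records, invented for this example.
training = ["Patient reports mild headache.", "Dose adjusted to 5 mg."]
evaluation = ["patient reports   mild headache.", "Unexplained rash after second dose."]

print(leaked_cases(evaluation, training))
```

Here the first evaluation case is flagged because it differs from a training record only in case and spacing. Storing the fingerprint alongside each case would also give a stable identifier for lineage records.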
Acceptance Criteria
Probabilistic systems don't pass or fail on a single run. Acceptance criteria are defined as bounded-variance thresholds calibrated to the risk tier: what range of behaviour is acceptable, and how confident we must be in it.
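One way to express "how confident" is to accept on a lower confidence bound of the pass rate over repeated runs rather than on the observed rate itself. The sketch below uses the Wilson score interval as an assumed technique; the tier names and required pass rates are hypothetical placeholders, not thresholds from the lifecycle.

```python
import math


def wilson_lower_bound(passes: int, runs: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a pass rate (z=1.96 is about 95%)."""
    if runs == 0:
        return 0.0
    p = passes / runs
    denom = 1 + z * z / runs
    centre = p + z * z / (2 * runs)
    margin = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs))
    return (centre - margin) / denom


# Hypothetical required pass rates per risk tier.
REQUIRED_PASS_RATE = {"low": 0.80, "medium": 0.90, "high": 0.97}


def accepted(passes: int, runs: int, tier: str) -> bool:
    """Accept only if the lower bound, not the observed rate, clears the tier threshold."""
    return wilson_lower_bound(passes, runs) >= REQUIRED_PASS_RATE[tier]


print(accepted(95, 100, "medium"))    # False: 95% observed, but too few runs to be sure
print(accepted(950, 1000, "medium"))  # True: same observed rate, tighter bound
```

The same 95% observed pass rate fails the assumed medium-tier threshold at 100 runs and passes at 1,000, which is the sense in which a higher tier demands more evidence, not just a higher score.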
HITL / HOTL Controls
Human-in-the-loop and human-on-the-loop controls are designed to match the risk tier — where a person reviews, where they supervise, and where the system runs unattended. Oversight is engineered, not assumed.
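As a rough illustration of oversight being engineered, not assumed, the mapping from risk tier to oversight mode can be written down as an explicit policy. Everything below is hypothetical: the tier names, the use of a model confidence score, and the cut-off values are assumptions chosen only to show the shape of such a policy.

```python
from enum import Enum


class Oversight(Enum):
    HITL = "a person reviews each output before it is used"
    HOTL = "a person supervises and can intervene"
    UNATTENDED = "the system runs without routine human review"


def oversight_for(tier: str, confidence: float) -> Oversight:
    """Hypothetical policy: higher tiers and lower confidence pull in more human review."""
    if tier == "high":
        return Oversight.HITL  # always reviewed, regardless of confidence
    if tier == "medium":
        return Oversight.HITL if confidence < 0.9 else Oversight.HOTL
    if tier == "low":
        return Oversight.HOTL if confidence < 0.5 else Oversight.UNATTENDED
    raise ValueError(f"unknown risk tier: {tier!r}")


print(oversight_for("medium", 0.95).name)  # prints "HOTL"
```

Writing the policy as code makes the unattended cases visible and reviewable, so they are a documented decision and not a default.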
Continuous Monitoring
Post-deployment, performance is monitored against the acceptance criteria with drift detection on inputs, outputs, and the operating context. Monitoring turns validation from a moment into a standing state.
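Input drift detection could be sketched with the Population Stability Index (PSI), which compares the binned distribution of a feature in production against a validation-time baseline. PSI is an assumed choice here, one of several common drift statistics; the bin edges, sample values, and the 0.25 trigger are illustrative assumptions.

```python
import math


def psi(baseline: list[float], current: list[float], edges: list[float]) -> float:
    """Population Stability Index between two samples over shared, sorted bin edges."""
    def proportions(values: list[float]) -> list[float]:
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v >= e for e in edges)] += 1  # index of the bin containing v
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = proportions(baseline), proportions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


# Hypothetical samples of one input feature, invented for this example.
baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
shifted = [0.6, 0.7, 0.8, 0.9, 0.9, 0.95, 0.85, 0.8]
edges = [0.25, 0.5, 0.75]

DRIFT_TRIGGER = 0.25  # hypothetical revalidation trigger, a commonly cited rule of thumb
print(psi(baseline, shifted, edges) > DRIFT_TRIGGER)  # prints True
```

The same comparison can be applied to output distributions; a statistic crossing its trigger is what would hand the system over to change control.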
Change Control
When the model, data, or context changes, or a monitoring trigger fires, change control decides what must be re-evaluated and routes the system back to the relevant step, as far back as context of use. This is the loop that makes the lifecycle continuous.
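The routing decision can be sketched as a lookup from change type to re-entry step. The step names come from the lifecycle itself; the change categories and the step each one maps to are hypothetical assumptions, and a real routing table would come out of the risk assessment.

```python
STEPS = [
    "Context of Use",
    "Risk Tiering",
    "Evaluation Design",
    "Acceptance Criteria",
    "HITL / HOTL Controls",
    "Continuous Monitoring",
    "Change Control",
]

# Hypothetical routing table: which step each kind of change re-enters at.
REENTRY_STEP = {
    "operating_context_changed": "Context of Use",
    "training_data_changed": "Evaluation Design",
    "model_version_changed": "Evaluation Design",
    "monitoring_trigger_fired": "Acceptance Criteria",
}


def steps_to_rerun(change: str) -> list[str]:
    """Return the re-entry step and every step after it; unknown changes restart at step 1."""
    start = REENTRY_STEP.get(change, "Context of Use")
    return STEPS[STEPS.index(start):]


print(steps_to_rerun("training_data_changed")[0])  # prints "Evaluation Design"
```

Defaulting unrecognised changes to the first step is a deliberately conservative choice in this sketch: anything change control cannot classify is treated as a possible change to intended use.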
Rigor scales with the risk tier
The same seven steps apply whether the stakes are exploratory or GxP-critical — what changes is the depth of evidence each step demands.
Discovery sprint
Fast cycle, lighter documentation. Establish context of use and acceptance criteria, evaluate against a representative set, ship with eyes open.
Regulated validation
Complete verification and validation with traceable evaluation sets, ALCOA+ lineage, HITL/HOTL controls, and audit trails for every step.
Drift watch
Continuous monitoring with drift detection, revalidation triggers, and change control running as an ongoing operational discipline.
Where this sits in the architecture
The lifecycle is the execution engine inside the Control Layer of the House of AI Trust. The Error Taxonomy supplies the failure classification that the evaluation-design and monitoring steps depend on; VALID Trust handles the supplier-qualification thread that runs horizontally through every layer.