Why AI Governance Fails Without a Control Layer: A House-of-AI-Trust Model for Regulated Drug Development

Mar 14

Trust isn't the prerequisite for AI adoption. It's the deliverable.

That single shift in framing reorganizes most of the work. If trust is what you have to demonstrate before deploying AI, validation becomes a checkpoint at the end of development. If trust is what you have to engineer into the system, validation becomes architectural — present at every layer of how the AI is governed, not just at the moment it goes live.

Regulators have made the demand clear. AI systems used in drug development must be fit for purpose, risk-appropriate, and governed across their lifecycle. What regulators have deliberately not specified is how that demonstration should be built. The FDA's 7-step credibility assessment, the GAMP AI Guide, the FDA-EMA Good AI Practice Principles, the EU's draft Annex 22, the FDA CSA final guidance — each names what a sponsor must show. None tells the sponsor how to construct it.

That gap is where validation architecture lives. The House of AI Trust is the architecture I use to organize the work.

The problem the architecture exists to address

Before introducing the layers, it's worth being explicit about what they're responding to. AI systems in regulated drug development fail in two structurally different ways, and most QMS frameworks were built to catch only one.

Evidence failures are problems with the data the system is fed: unrepresentative training data, mislabeled historical records, missing failure modes in the training set, drift from process changes, OCR artifacts in scanned source documents. These are upstream failures. They're data governance problems.

Inference failures are problems with how the system reasons over that data: hallucinated outputs, confidence miscalibration, sycophantic agreement with user bias, pattern matching without causation, context misunderstanding. These are downstream failures. They're AI governance problems.

Each requires fundamentally different controls. The architecture below is organized so that each governance domain has explicit ownership and the controls that respond to each failure type have an explicit home.

Layer 1: Trust Infrastructure — Makes AI possible

The foundation. Two things must be in place before any AI system can be defensibly deployed in a regulated environment: data governance and regulatory landscape awareness.

Data governance is the layer that addresses evidence failures. ALCOA+ data integrity, provenance documentation, training-test separation, cohort representativeness, labeling QC, batch effect controls — these are mature disciplines with established frameworks and established owners. Most organizations have some version of this in place from their pre-AI work. What changes with AI is that the data being governed is no longer just operational data being recorded; it includes training data, evaluation data, and production inference data, each with its own provenance and integrity requirements.
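
One of the controls named above, training-test separation, can be checked mechanically. The following is a minimal sketch, assuming records are plain dictionaries and that a small set of key fields (here the hypothetical `batch_id` and `site`) identifies a record; the function names are illustrative and not drawn from any framework.

```python
import hashlib


def record_fingerprint(record: dict, key_fields: tuple[str, ...]) -> str:
    """Hash the identifying fields so the same record is recognized in both sets."""
    # Normalize case and whitespace so trivial formatting differences do not hide overlap.
    joined = "|".join(str(record[f]).strip().lower() for f in key_fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def find_leakage(train: list[dict], test: list[dict], key_fields: tuple[str, ...]) -> list[dict]:
    """Return the test records whose identifying fields also appear in the training set."""
    train_prints = {record_fingerprint(r, key_fields) for r in train}
    return [r for r in test if record_fingerprint(r, key_fields) in train_prints]


# Hypothetical records for illustration only.
train_set = [{"batch_id": "B-001", "site": "A"}, {"batch_id": "B-002", "site": "A"}]
test_set = [{"batch_id": "b-001 ", "site": "a"}, {"batch_id": "B-003", "site": "A"}]
leaked = find_leakage(train_set, test_set, ("batch_id", "site"))
```

Which fields count as identifying is a data governance decision, not a coding one; the sketch only shows that once the decision is made, the check is cheap to run on every retraining.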

Regulatory landscape awareness sits alongside data governance because the regulatory expectations are moving fast enough that yesterday's compliance posture isn't reliable. Recent landmarks — the GAMP AI Guide (July 2025), the FDA-EMA Good AI Practice Principles (January 2026), the EU's draft Annex 22 (July 2025), the FDA CSA final guidance — each shifts what defensible AI deployment requires. Layer 1 is where an organization tracks these signals and translates them into internal policy before they become inspection findings.

This layer doesn't make AI defensible on its own. It makes AI possible by establishing the foundation everything else rests on.

Several regulatory developments since early 2024 illustrate how quickly these expectations are moving:

“Artificial Intelligence and Medicinal Products” (March 2024, updated February 2025)
The EU AI Act (June 2024)
EMA Reflection Paper on AI in the Medicinal Product Lifecycle (September 9, 2024)
FDA Draft Guidance - AI in Drug Regulatory Decision-Making (January 7, 2025)
EMA First AI Qualification Opinion (March 2025)
FDA “Elsa” Launch (June 2025)
GAMP Artificial Intelligence Guide (July 2025)
FDA internal deployment of agentic AI (December 1, 2025)
CIOMS WG XIV Final report on AI in Pharmacovigilance (December 4, 2025)
FDA–EMA Good AI Practice Principles (January 14, 2026)

If the regulatory landscape defines what trust looks like, data governance determines whether trust is even possible.

Every AI system in regulated science inherits the integrity of the data it was trained, fine-tuned, and evaluated on. When that data is well-governed — known provenance, documented lineage, controlled access, complete metadata, and ALCOA+ alignment — validation has something to stand on. When it isn't, validation is structurally impossible regardless of how rigorous the downstream protocols are. You cannot validate a model whose training data you cannot account for.
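
To make "known provenance" concrete, one option is a manifest entry that ties each dataset to a content hash, a role, a source system, and a named person. This is a sketch under stated assumptions: the `DatasetManifestEntry` fields and the role labels are hypothetical, and a real implementation would live inside a validated records system rather than in memory.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DatasetManifestEntry:
    dataset_name: str
    role: str             # e.g. "training", "evaluation", "production_inference"
    source_system: str
    sha256: str           # content hash taken when the dataset was registered
    recorded_by: str      # attributable: who registered it
    recorded_at_utc: str  # contemporaneous: when it was registered


def build_entry(dataset_name: str, role: str, source_system: str,
                content: bytes, recorded_by: str) -> DatasetManifestEntry:
    """Register a dataset by hashing its bytes and stamping who and when."""
    return DatasetManifestEntry(
        dataset_name=dataset_name,
        role=role,
        source_system=source_system,
        sha256=hashlib.sha256(content).hexdigest(),
        recorded_by=recorded_by,
        recorded_at_utc=datetime.now(timezone.utc).isoformat(),
    )


def matches_manifest(entry: DatasetManifestEntry, content: bytes) -> bool:
    """True only if the bytes in hand are the bytes that were registered."""
    return hashlib.sha256(content).hexdigest() == entry.sha256


# Hypothetical dataset for illustration only.
entry = build_entry("deviation_history_v1", "training", "QMS export", b"abc", "j.doe")
```

A model trained on data that fails `matches_manifest` is a model whose training data cannot be accounted for, which is the situation the paragraph above describes.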

The Pistoia Alliance's December 2025 finding that more than 1 in 4 life sciences professionals do not know what data their AI systems use is not a data quality problem. It is a governance gap, and it becomes a regulatory gap the moment those models touch GxP-regulated workflows. 21 CFR Part 11 and EU GMP Annex 11 do not pause their requirements because the system in question is probabilistic. The expectation that electronic records be attributable, legible, contemporaneous, original, and accurate applies whether the record is a batch ledger or a training corpus.

Data governance is not a precondition organizations can defer to later phases of AI adoption. It is the load-bearing pillar that determines whether everything above it can hold weight.

Layer 2: AI Governance — Makes AI manageable

If Layer 1 is the data foundation, Layer 2 is the AI-specific governance that sits on top of it. This is where inference failures become the responsibility of an explicit ownership structure.

The components: model inventory (every AI/ML system touching a GxP process, including the ones nobody thinks of as "AI" — predictive maintenance, document summarization, deviation auto-classification, visual inspection), policy (acceptable use, prohibited use, escalation paths), accountability (who owns the model, who approves changes, who is responsible when something goes wrong), and explainability requirements (what level of interpretability is required for what risk tier).
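
The inventory and accountability components can be sketched as a record plus a gap check. The field names, the three-level `RiskTier`, and the rule that high-risk systems need a stated explainability requirement are assumptions made for illustration; an organization's own policy would define them.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class InventoryEntry:
    system_name: str
    gxp_process: str
    risk_tier: RiskTier
    model_owner: str = ""
    change_approver: str = ""
    explainability_requirement: str = ""


def governance_gaps(entry: InventoryEntry) -> list[str]:
    """List the accountability and explainability gaps for one inventoried system."""
    gaps = []
    if not entry.model_owner:
        gaps.append("no model owner")
    if not entry.change_approver:
        gaps.append("no change approver")
    # Assumed policy rule: only high-risk systems must state an explainability requirement.
    if entry.risk_tier is RiskTier.HIGH and not entry.explainability_requirement:
        gaps.append("high-risk system without an explainability requirement")
    return gaps


# Hypothetical entry: a system few people would think of as "AI".
visual_inspection = InventoryEntry(
    system_name="vial visual inspection classifier",
    gxp_process="fill-finish release",
    risk_tier=RiskTier.HIGH,
    model_owner="QA Operations",
)
gaps = governance_gaps(visual_inspection)
```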

Most organizations adopting AI in 2026 don't have a complete Layer 2. They have model inventories that miss systems, policies that predate generative AI, accountability structures that haven't caught up to autonomous workflows, and explainability requirements borrowed from other domains. Layer 2 is where most operational AI governance work lives, and it's where most current readiness gaps surface.

This layer doesn't validate any specific AI system. It makes AI manageable by creating the organizational structure that allows individual systems to be governed at all.

Layer 3: The Control Layer — Makes AI defensible

This is where validation lives. It's where most of my published work concentrates and where most practitioner attention belongs. Layer 3 is the layer that translates governance commitments from Layer 2 into specific, inspection-ready evidence for specific systems.

The components: validation methodology, evaluation design, error taxonomies, human-in-the-loop (HITL) and human-on-the-loop (HOTL) controls, continuous monitoring, and change control. Each system in scope under Layer 2 needs Layer 3 work executed against it.

The execution methodology I use for Layer 3 is the Britt Probabilistic Validation Lifecycle, a seven-step process specifically designed for LLMs, generative AI, and agentic systems:

Context of Use — Define what the AI does, where it sits in the GxP process, and what decision it informs.
Risk Tiering — Grade by influence on patient safety, product quality, and data integrity. Match rigor to consequence.
Evaluation Design — Build the test set: ground truth, edge cases, adversarial inputs. Define metrics per error class.
Acceptance Criteria — Set thresholds with confidence intervals. Pre-define what "good enough" means before testing.
HITL/HOTL Controls — Specify where humans review, override, or are simply informed. Document the handoff.
Continuous Monitoring — Track drift, calibration, and error patterns in production. Validation is no longer one-and-done.
Change Control — Define triggers for revalidation: model updates, data drift, scope expansion, vendor changes.
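
The Acceptance Criteria step asks for thresholds with confidence intervals. One common way to do that for a pass/fail metric is the Wilson score interval, shown here as a sketch. Choosing Wilson over other interval methods, the default `z = 1.96` (roughly 95% two-sided), and the 0.90 threshold are illustrative assumptions, not part of the Lifecycle itself.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


def meets_acceptance(successes: int, n: int, threshold: float, z: float = 1.96) -> bool:
    """Pass only if the lower bound of the interval clears the pre-defined threshold."""
    lower, _ = wilson_interval(successes, n, z)
    return lower >= threshold


# Same observed accuracy (95%), different amounts of evidence.
small_sample_passes = meets_acceptance(95, 100, threshold=0.90)
large_sample_passes = meets_acceptance(950, 1000, threshold=0.90)
```

With 95 correct out of 100, the lower bound is roughly 0.89, so a 0.90 criterion is not met; with 950 out of 1,000 it is roughly 0.93 and the criterion is met. Fixing the threshold and the interval method before testing is what keeps the criterion from drifting toward whatever the results turn out to be.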

Where the GAMP AI Guide opens the door for AI validation in regulated environments, the Britt Probabilistic Validation Lifecycle defines what's on the other side. Where the House of AI Trust gives you the architecture, the Lifecycle gives you the execution.

This layer is what makes any specific AI system defensible to a regulator. It's the operational center of the architecture.

Layer 4: Domain and Process Context — Makes AI useful

The control layer can produce validated AI that is technically defensible but operationally useless if it's not grounded in the domain. Layer 4 is where domain expertise — biological variability, clinical significance, therapeutic area knowledge, process understanding — shapes what the AI is being asked to do and how its outputs should be interpreted.

This is the layer most often skipped in cross-functional AI work, because it requires people who can speak both the technical language of model behavior and the scientific language of what the model is representing. A drift signal in a pharmacokinetic/pharmacodynamic (PK/PD) model means something different from a drift signal in a deviation triage model. The acceptance criteria for an AI tool supporting biomarker classification depend on the underlying biology of the biomarker, not just the statistical performance of the model.

For probabilistic systems specifically, Layer 4 is where the interpretation of uncertainty happens. A model that produces confidence-scored outputs is only useful if someone with domain expertise can translate "87% confidence" into "is this consistent with what we know about biological variability in this context, and what's the clinical or process significance of being wrong here." Without Layer 4, Layer 3 produces statistically valid evidence that nobody knows how to act on.
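
One way to record that domain judgment is a per-context policy that decides what a given confidence score triggers. This is a sketch only: the two contexts, the threshold values, and the route names are hypothetical placeholders for numbers that domain experts would set and justify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextPolicy:
    context_of_use: str
    auto_accept_at: float  # at or above this, output proceeds with oversight only
    reject_below: float    # below this, output is discarded


def route(confidence: float, policy: ContextPolicy) -> str:
    """Map a confidence score to an action under one context's policy."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < policy.reject_below:
        return "reject"
    if confidence >= policy.auto_accept_at:
        return "accept_with_oversight"
    return "human_review"


# Hypothetical thresholds: being wrong costs more in biomarker classification.
deviation_triage = ContextPolicy("deviation triage", auto_accept_at=0.85, reject_below=0.50)
biomarker_calls = ContextPolicy("biomarker classification", auto_accept_at=0.97, reject_below=0.70)

triage_action = route(0.87, deviation_triage)
biomarker_action = route(0.87, biomarker_calls)
```

The same 87% lands in different places because the consequence of error differs, and that difference is a Layer 4 input rather than a property of the model.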

This layer makes AI useful by ensuring that the validated system is solving the right problem in a way that fits the science.

Layer 5: Business ROI — Makes AI investable

The top of the architecture. Layer 5 is where validated, useful AI translates into outcomes the business can act on: cost of failure avoided, speed to market improved, competitive advantage built, trust as an asset that compounds.

Most organizations attempting AI deployment without the layers below try to start at Layer 5 — building business cases for AI before establishing the infrastructure, governance, controls, or domain context to support it. Those projects show up in the abandonment statistics. Of life sciences companies attempting AI deployment in 2025, only 22% successfully scaled, and only 9% reported significant returns. The gap is rarely the AI itself. It's the architecture beneath the AI.

This layer makes AI investable. Without the four layers below, Layer 5 is speculation. With them, it's the natural output of the system.

The four cross-cutting threads

The five layers describe the vertical structure. Four capabilities run horizontally through every layer and every step of the Lifecycle, and most organizations don't yet have any of them in place.

Security. Prompt injection testing across LLM-facing interfaces, output sanitization to prevent data leakage, agent authentication and action-scope lockdown for agentic systems, adversarial input testing per model type and context of use. AI-specific security is not the same as IT security; ISO 27001 and SOC 2 don't address it.

Explainability. Chain-of-thought logging for LLM reasoning in regulated decisions, confidence calibration verification (does 90% confidence actually mean 90% accuracy?), attribution traceability so each output can be traced to its source data, decision-path auditability across multi-agent handoffs.
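
The calibration question in the parenthesis above can be measured. A standard summary is expected calibration error (ECE): bin predictions by stated confidence, then compare each bin's average confidence with its observed accuracy. The sketch below assumes equal-width bins and a default of ten; both are conventional choices, not requirements.

```python
def expected_calibration_error(confidences: list[float], correct: list[bool],
                               n_bins: int = 10) -> float:
    """Weighted average gap between stated confidence and observed accuracy."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need two non-empty lists of equal length")
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the top bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_confidence - accuracy)
    return ece


# Ten outputs all stated at 90% confidence.
well_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
```

A value near zero means stated confidence tracks accuracy; the overconfident case here has a gap of 0.4, because 90% stated confidence corresponded to 50% accuracy.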

Communication. Uncertainty indicators visible to end users at the point of AI output, plain-language limitation disclosures for non-technical stakeholders, regulatory-facing transparency documentation for inspectors, patient-facing communication where AI influences treatment decisions.

Supplier qualification. LLM vendor qualification beyond standard GxP supplier audits, change notification SLAs so the sponsor learns of upstream model updates before they take effect, model supply chain integrity, API dependency mapping. The supplier qualification questions for AI vendors don't map cleanly onto traditional GAMP supplier audits, and the gap is now an inspection risk.

These threads aren't optional layers an organization can defer. They're capabilities that have to exist for any of the five layers to function defensibly.

The transition to the next phase

The architecture is now in place. The next phase of this work stress-tests it against increasingly complex systems — agentic workflows, multimodal models, full validation case studies grounded in real deployments. The methodological work continues in parallel: an error taxonomy for probabilistic systems, an examination of harnesses as the control substrate.

We must stop thinking of the harness as the environment where testing happens, and start thinking of it as the engine where determinism lives. You test the composite system through the harness pre-deployment, and you monitor the composite system via the harness in production. It is the same code performing two functions, runtime control and validation assurance. That makes it the operational substrate for Layer 3, and it is also where human performance is qualified for HITL/HOTL controls.
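
The "same code, two functions" idea can be sketched as a wrapper whose deterministic checks run on every model call, whether that call comes from a validation run or from production. Everything named here (`Harness`, `HarnessResult`, `stub_model`, the two checks) is a hypothetical illustration, and the stub stands in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class HarnessResult:
    output: str
    released: bool
    failed_checks: list[str]


@dataclass
class Harness:
    model: Callable[[str], str]
    checks: dict[str, Callable[[str, str], bool]]
    audit_log: list[HarnessResult] = field(default_factory=list)

    def run(self, prompt: str) -> HarnessResult:
        """Runtime control: call the model, apply every check, log the outcome."""
        output = self.model(prompt)
        failed = [name for name, check in self.checks.items() if not check(prompt, output)]
        result = HarnessResult(output, released=not failed, failed_checks=failed)
        self.audit_log.append(result)
        return result

    def validate(self, test_prompts: list[str]) -> float:
        """Validation assurance: the same run() path, scored over a test set."""
        if not test_prompts:
            raise ValueError("test set is empty")
        results = [self.run(p) for p in test_prompts]
        return sum(r.released for r in results) / len(results)


def stub_model(prompt: str) -> str:
    """Placeholder for a real model call."""
    return "Deviation class: minor" if "spill" in prompt else "I am not sure"


harness = Harness(
    model=stub_model,
    checks={
        "states_a_class": lambda prompt, output: output.startswith("Deviation class:"),
        "within_length_limit": lambda prompt, output: len(output) <= 200,
    },
)
release_rate = harness.validate(["small spill in suite 2", "unlabelled vial"])
production_result = harness.run("spill near line 4")
```

Because `validate` calls `run` rather than reimplementing it, what was tested before deployment is the path that executes in production, and the audit log has the same shape in both settings.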

The underlying claim is what the architecture exists to support. In regulated science, the question is never simply whether the AI works. It's whether the sponsor can stand behind the evidence the AI produces — and trust, the kind that survives an inspection and a Phase III readout, isn't a feeling. It's an architecture.

References:

Britt, K. (2026, March 14). Why AI governance fails without a control layer: A house-of-trust model for regulated drug development. Britt Biocomputing | Fit-for-Purpose AI Validation. Retrieved March 17, 2026, from https://www.kaylabritt.com/blog-1-1/why-ai-governance-fails-without-a-control-layer-a-house-of-trust-model-for-regulated-drug-development

Council for International Organizations of Medical Sciences (CIOMS). (2025). Artificial intelligence in pharmacovigilance (CIOMS Working Group XIV final report). CIOMS. Retrieved March 17, 2026, from https://cioms.ch/artificial-intelligence-inpv/ (Free PDF download requires registration/login; DOI resolves to the same page: https://doi.org/10.56759/cdob6397.)

Council for International Organizations of Medical Sciences (CIOMS). (2025, December 4). Artificial intelligence in pharmacovigilance (CIOMS Working Group XIV final report) [PDF]. Retrieved March 17, 2026, from https://dtoepidemiologia.wordpress.com/wp-content/uploads/2025/12/cioms-2025-dic-web_2105_cioms_artificial-intelligence_in_pharmacovigilance_report_en_20251204.pdf (Third-party-hosted PDF copy; official CIOMS distribution is via registration at https://cioms.ch/artificial-intelligence-inpv/.)

European Medicines Agency. (2024, September 9). Reflection paper on the use of artificial intelligence (AI) in the medicinal product lifecycle [PDF]. Retrieved March 17, 2026, from https://www.ema.europa.eu/en/documents/scientific-guideline/reflection-paper-use-artificial-intelligence-ai-medicinal-product-lifecycle_en.pdf

European Medicines Agency. (2026, January 14). EMA and FDA set common principles for AI in medicine development. Retrieved March 17, 2026, from https://www.ema.europa.eu/en/news/ema-fda-set-common-principles-ai-medicine-development-0

European Parliament & Council of the European Union. (2024, July 12). Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) (OJ L, 2024/1689) [PDF]. Retrieved March 17, 2026, from https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng/pdf

International Society for Pharmaceutical Engineering (ISPE). (2025, July 14). ISPE GAMP® Guide: Artificial intelligence (guidance document listing entry). Retrieved March 17, 2026, from https://ispe.org/topics/guidance-documents

International Society for Pharmaceutical Engineering (ISPE). (2025, July). ISPE GAMP® Guide: Artificial intelligence (product page). Retrieved March 17, 2026, from https://ispe.org/publications/guidance-documents/gamp-guide-artificial-intelligence

International Society for Pharmaceutical Engineering (ISPE). (2025). GAMP® guide: Artificial intelligence—Table of contents [PDF]. Retrieved March 17, 2026, from https://ispe.org/sites/default/files/publications/guidance-documents/2025-TOC/GAMP_Guide-AI_TOC.pdf (Public TOC; full guide text paywalled.)

National Institute of Standards and Technology. (2023, January). Artificial intelligence risk management framework (AI RMF 1.0) (NIST AI 100-1) [PDF]. U.S. Department of Commerce. Retrieved March 17, 2026, from https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf

OpenAI. (2025, December 15). FrontierScience: Evaluating AI’s ability to perform scientific research tasks [PDF]. Retrieved March 17, 2026, from https://cdn.openai.com/pdf/2fcd284c-b468-4c21-8ee0-7a783933efcc/frontierscience-paper.pdf

OpenAI. (2025, December 16). Evaluating AI’s ability to perform scientific research tasks. Retrieved March 17, 2026, from https://openai.com/index/frontierscience/

Pistoia Alliance. (2025, December 3). Pistoia Alliance research finds 1 in 4 life sciences professionals do not know what data their AI models use. Retrieved March 17, 2026, from https://pistoiaalliance.org/news/1-in-4-life-sciences-professionals-dont-know-what-data-their-ai-models-use/

U.S. Food and Drug Administration. (2025, January 6). Considerations for the use of artificial intelligence to support regulatory decision-making for drug and biological products (draft guidance webpage). Retrieved March 17, 2026, from https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological

U.S. Food and Drug Administration. (2025, January). Considerations for the use of artificial intelligence to support regulatory decision-making for drug and biological products: Guidance for industry and other interested parties (draft guidance) [PDF]. Retrieved March 17, 2026, from https://www.fda.gov/media/184830/download (Document version is “draft guidance”; posting date is on the FDA guidance webpage.)

U.S. Food and Drug Administration. (2025, January 7). Considerations for the use of artificial intelligence to support regulatory decision-making for drug and biological products; availability (notice). Federal Register. Retrieved March 17, 2026, from https://www.federalregister.gov/documents/2025/01/07/2024-31542/considerations-for-the-use-of-artificial-intelligence-to-support-regulatory-decision-making-for-drug

U.S. Food and Drug Administration. (2025, June 2). FDA launches agency-wide AI tool to optimize performance for the American people (Elsa) (press announcement). Retrieved March 17, 2026, from https://www.fda.gov/news-events/press-announcements/fda-launches-agency-wide-ai-tool-optimize-performance-american-people

U.S. Food and Drug Administration. (2025, December 1). FDA expands artificial intelligence capabilities with agentic AI deployment (press announcement). Retrieved March 17, 2026, from https://www.fda.gov/news-events/press-announcements/fda-expands-artificial-intelligence-capabilities-agentic-ai-deployment

U.S. Food and Drug Administration, & European Medicines Agency. (2026, January 14). Guiding principles of good AI practice in drug development [PDF]. Retrieved March 17, 2026, from https://www.fda.gov/media/189581/download (Primary PDF; “content current as of 01/14/2026” appears on the FDA landing page.)

U.S. Food and Drug Administration. (2026, January 14). Guiding principles of good AI practice in drug development (landing page). Retrieved March 17, 2026, from https://www.fda.gov/about-fda/artificial-intelligence-drug-development/guiding-principles-good-ai-practice-drug-development

Kayla Britt