In 2019, a landmark study published in Science (Obermeyer et al.) revealed that an algorithm used by Optum to manage the care of roughly 200 million patients in the United States systematically discriminated against Black patients. The algorithm used healthcare spending as a proxy for health need, but because Black patients historically had less access to care and therefore lower spending, the system scored them as healthier than equally sick white patients: at a given risk score, Black patients were considerably sicker than white patients. This scenario places you inside a similar crisis.
You are the AI governance lead at MedFirst Health System, a network of 12 hospitals across the Southeastern United States. Six months ago, MedFirst deployed TriageAI, a machine learning system that scores incoming emergency department patients on a 1-100 acuity scale. The score determines how quickly a patient is seen and what resources are allocated.
A routine internal audit has uncovered alarming findings: Black patients receive acuity scores that are on average 8.3 points lower than white patients presenting with identical symptoms and vital signs. The disparity means Black patients wait an average of 23 minutes longer to be seen. Two adverse patient outcomes in the past quarter may be linked to delayed triage. A journalist from STAT News has contacted your communications team requesting comment.
The AI system was developed by a third-party vendor, HealthScore Analytics, and trained on five years of historical patient data from MedFirst's own electronic health records. The vendor claims the model passed all standard performance benchmarks during validation.
Several governance breakdowns enabled this outcome:
No pre-deployment bias testing across protected classes. MedFirst's procurement process evaluated TriageAI on aggregate accuracy metrics (AUC, sensitivity, specificity) but never required disaggregated performance analysis across racial groups. The AIGP Body of Knowledge emphasizes that aggregate metrics can mask significant subgroup disparities: a model can look accurate overall while performing markedly worse for a particular group, which is precisely what disaggregated evaluation is designed to surface.
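A disaggregated check of this kind is straightforward to script. The sketch below uses synthetic data and hypothetical column names (`race`, `high_acuity`, `score`) — not MedFirst's actual records — to show how an aggregate AUC can look acceptable while one subgroup's AUC is far worse:

```python
# Sketch: disaggregated performance evaluation. All data, column names,
# and noise levels are synthetic illustrations.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def disaggregated_auc(df, group_col, label_col, score_col):
    """AUC computed separately for each subgroup rather than in aggregate."""
    return pd.Series({
        name: roc_auc_score(g[label_col], g[score_col])
        for name, g in df.groupby(group_col)
    })

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "race": rng.choice(["black", "white"], size=n),
    "high_acuity": rng.integers(0, 2, size=n),  # 1 = truly high-acuity
})
# Scores track true acuity closely for one group, loosely for the other.
noise = np.where(df["race"] == "white", 0.3, 1.5)
df["score"] = df["high_acuity"] + rng.normal(0.0, noise)

print("aggregate AUC:", round(roc_auc_score(df["high_acuity"], df["score"]), 3))
print(disaggregated_auc(df, "race", "high_acuity", "score").round(3))
```

Running this shows an aggregate AUC that would pass a typical benchmark while the per-group numbers diverge sharply — the pattern a procurement review limited to aggregate metrics would miss.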
Proxy variable contamination. The model used ZIP code, insurance type, and historical visit frequency as features. These variables are strongly correlated with race due to residential segregation and disparities in insurance coverage. The vendor did not conduct a proxy analysis or document known correlations in the model card.
Inadequate vendor due diligence. MedFirst did not require HealthScore Analytics to provide a model card, datasheets for the training data, or documentation of bias testing methodology. The vendor contract contained no provisions for algorithmic auditing or performance guarantees across demographic groups.
No ongoing monitoring for drift or disparate impact. Once deployed, no mechanism existed to continuously monitor TriageAI's scores for demographic disparities. The bias was only discovered during a scheduled internal audit — six months after deployment.
As the AI governance lead, you must now lead a cross-functional response. The framework below moves from immediate containment through short-term remediation to long-term governance reform:
Immediate actions (24-48 hours):
- Suspend TriageAI and revert to the previous manual triage protocol
- Notify the Chief Medical Officer and legal counsel of the adverse findings
- Preserve all model artifacts, logs, and audit data for potential regulatory or legal proceedings
- Prepare a holding statement for the STAT News inquiry
Short-term remediation (1-4 weeks):
- Commission an independent third-party algorithmic audit (e.g., from a specialist firm such as ORCAA, O'Neil Risk Consulting & Algorithmic Auditing)
- Conduct a root cause analysis on the two adverse patient outcomes
- Review all vendor contracts for algorithmic accountability provisions
- File any required incident reports with relevant state health regulators
Long-term governance improvements:
- Implement mandatory disaggregated performance testing before any clinical AI deployment
- Require model cards and bias documentation from all AI vendors
- Establish continuous fairness monitoring dashboards for deployed AI systems
- Create an AI ethics review board with clinical, technical, legal, and patient advocacy representation
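The continuous monitoring called for above can start with even a simple rolling check. The sketch below is an illustrative core for a fairness dashboard, with invented window sizes, thresholds, and score distributions (the simulated distributions are chosen only to mimic the 8.3-point disparity from the audit): track per-group mean scores over a recent window and alert when the gap exceeds a tolerance.

```python
# Sketch: rolling disparity monitor. Window size, alert threshold, and the
# simulated score distributions are illustrative assumptions.
import random
from collections import deque
from statistics import mean

class DisparityMonitor:
    def __init__(self, window: int = 500, alert_gap: float = 5.0):
        self.scores = {}              # group -> deque of recent scores
        self.window = window
        self.alert_gap = alert_gap    # max tolerated gap in mean scores

    def record(self, group: str, score: float) -> None:
        self.scores.setdefault(group, deque(maxlen=self.window)).append(score)

    def gap(self) -> float:
        """Spread between the highest and lowest per-group mean score."""
        means = [mean(q) for q in self.scores.values() if q]
        return max(means) - min(means) if len(means) > 1 else 0.0

    def alert(self) -> bool:
        return self.gap() > self.alert_gap

# Feed in scores that mimic the audited 8.3-point disparity.
monitor = DisparityMonitor(window=200, alert_gap=5.0)
random.seed(0)
for _ in range(300):
    monitor.record("white", random.gauss(62.0, 10.0))
    monitor.record("black", random.gauss(53.7, 10.0))
print("gap:", round(monitor.gap(), 1), "alert:", monitor.alert())
```

Hooked to live triage scores instead of simulated ones, a monitor like this would have surfaced the disparity within days of deployment rather than at a six-month audit.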