Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States.
Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98105, United States.
J Am Med Inform Assoc. 2024 Apr 19;31(5):1172-1183. doi: 10.1093/jamia/ocae060.
Leveraging artificial intelligence (AI) in conjunction with electronic health records (EHRs) holds transformative potential to improve healthcare. However, bias in AI risks worsening healthcare disparities and cannot be overlooked. This study reviews methods for handling various biases in AI models developed using EHR data.
We conducted a systematic review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, analyzing articles from PubMed, Web of Science, and IEEE published between January 1, 2010 and December 17, 2023. The review identified key biases, outlined strategies for detecting and mitigating bias throughout AI model development, and analyzed the metrics used for bias assessment.
Of the 450 articles retrieved, 20 met our inclusion criteria, revealing 6 major bias types: algorithmic, confounding, implicit, measurement, selection, and temporal. The AI models were primarily developed for predictive tasks, yet none had been deployed in real-world healthcare settings. Five studies focused on detecting implicit and algorithmic bias using fairness metrics such as statistical parity, equal opportunity, and predictive equity. Fifteen studies proposed bias-mitigation strategies, especially targeting implicit and selection bias. These strategies, evaluated with both performance and fairness metrics, predominantly involved data collection and preprocessing techniques such as resampling and reweighting.
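To make the fairness metrics and preprocessing techniques named above concrete, the sketch below is a minimal, illustrative Python example that is not drawn from any of the reviewed studies. It computes statistical parity and equal opportunity differences for binary predictions across a binary protected attribute and derives Kamiran-and-Calders-style reweighing sample weights; the function names and synthetic data are hypothetical, assumed here only for illustration.

```python
import numpy as np

def statistical_parity_difference(y_pred, group):
    """Difference in positive prediction rates between group 1 and group 0."""
    return y_pred[group == 1].mean() - y_pred[group == 0].mean()

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true positive rates (recall) between group 1 and group 0."""
    tpr = []
    for g in (0, 1):
        mask = (group == g) & (y_true == 1)   # positives within each group
        tpr.append(y_pred[mask].mean())
    return tpr[1] - tpr[0]

def reweighing_weights(y_true, group):
    """Reweighing (Kamiran & Calders): weight each (group, label) cell so that
    the protected attribute and the outcome appear statistically independent."""
    weights = np.empty(len(y_true), dtype=float)
    for g in np.unique(group):
        for y in np.unique(y_true):
            mask = (group == g) & (y_true == y)
            expected = (group == g).mean() * (y_true == y).mean()
            observed = mask.mean()
            weights[mask] = expected / observed
    return weights

# Toy example: synthetic binary outcome and predictions correlated with a
# binary protected attribute (purely hypothetical data).
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
y_true = rng.binomial(1, 0.3 + 0.2 * group)
y_pred = rng.binomial(1, 0.25 + 0.3 * group)

print(statistical_parity_difference(y_pred, group))
print(equal_opportunity_difference(y_true, y_pred, group))
print(reweighing_weights(y_true, group)[:5])   # sample weights for model fitting
```

In a preprocessing-based mitigation workflow, the returned weights would typically be passed to a model's training routine as per-sample weights, while the two metric functions would be reported alongside standard performance metrics.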
This review highlights evolving strategies for mitigating bias in EHR-based AI models and emphasizes the urgent need for standardized, detailed reporting of methodologies and for systematic real-world testing and evaluation. Such measures are essential for gauging models' practical impact and for fostering ethical AI that ensures fairness and equity in healthcare.