Cao Rui, Liu Yanan, Wen Xin, Liao Caiqing, Wang Xin, Gao Yuan, Tan Tao
School of Software, Taiyuan University of Technology, Taiyuan 030024, China.
Department of Radiology, Netherlands Cancer Institute (NKI), Plesmanlaan 121, Amsterdam 1066 CX, the Netherlands.
iScience. 2024 Apr 10;27(5):109712. doi: 10.1016/j.isci.2024.109712. eCollection 2024 May 17.
There are concerns that artificial intelligence (AI) algorithms may create underdiagnosis bias by mislabeling patients with certain attributes (e.g., female or young) as healthy. Addressing this bias is crucial given the urgent need for AI diagnostics in the face of rapidly spreading infectious diseases such as COVID-19. We find that prevalent AI diagnostic models underdiagnose specific patient populations, and that the underdiagnosis rate is higher in some intersectional subgroups (for example, females aged 20-40 years). Additionally, we find that training AI models on heterogeneous datasets (positive and negative samples drawn from different datasets) may lead to poor model generalization. Classification performance varies significantly across test sets, with accuracy on the best-performing test set more than 40% higher than on the worst. In conclusion, we developed an AI bias analysis pipeline to help researchers recognize and address biases that affect medical equality and ethics.
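To illustrate the underdiagnosis metric discussed above, the following is a minimal sketch (not the authors' released pipeline) of how one might estimate the underdiagnosis rate, i.e., the share of truly diseased patients predicted healthy, for intersectional subgroups such as sex by age band; all column names and bin edges are assumptions for illustration only.

```python
import pandas as pd

def underdiagnosis_rates(df: pd.DataFrame) -> pd.DataFrame:
    """Assumed columns: 'label' (1 = diseased), 'pred' (1 = flagged positive),
    'sex', and 'age'. Returns the per-subgroup underdiagnosis (miss) rate."""
    diseased = df[df["label"] == 1].copy()
    # Hypothetical age bands; the paper highlights the 20-40 group.
    diseased["age_band"] = pd.cut(
        diseased["age"], bins=[0, 20, 40, 60, 120],
        labels=["0-20", "20-40", "40-60", "60+"]
    )
    # Underdiagnosis rate = fraction of diseased patients predicted healthy.
    rates = (
        diseased.assign(missed=lambda d: (d["pred"] == 0).astype(int))
                .groupby(["sex", "age_band"], observed=True)["missed"]
                .mean()
                .rename("underdiagnosis_rate")
                .reset_index()
    )
    return rates

# Example usage with synthetic predictions:
# df = pd.DataFrame({"label": [...], "pred": [...], "sex": [...], "age": [...]})
# print(underdiagnosis_rates(df))
```

Comparing these per-subgroup rates against the overall rate is one straightforward way to surface the kind of intersectional disparities the abstract describes.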