Huti Mohamed, Lee Tiarna, Sawyer Elinor, King Andrew P
School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK.
School of Cancer and Pharmaceutical Sciences, King's College London, London, UK.
Clinical Image-Based Procedures, Fairness of AI in Medical Imaging, and Ethical and Philosophical Issues in Medical Imaging (2023). 2023;14242:225-234. doi: 10.1007/978-3-031-45249-9_22. Epub 2023 Oct 9.
Recent research has shown that artificial intelligence (AI) models can exhibit biased performance when trained on data that are imbalanced with respect to a protected attribute. Most work to date has focused on deep learning models, but classical AI techniques that use hand-crafted features may also be susceptible to such bias. In this paper we investigate the potential for race bias in random forest (RF) models trained using radiomics features. Our application is prediction of tumour molecular subtype from dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) of breast cancer patients. Our results show that radiomics features derived from DCE-MRI data do contain race-identifiable information, and that RF models can be trained to predict White and Black race from these data with 60-70% accuracy, depending on the subset of features used. Furthermore, RF models trained to predict tumour molecular subtype using race-imbalanced data appear to behave in a biased way, performing better on test data from the race that was over-represented during training.
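The abstract describes training an RF classifier on pre-extracted radiomics features to predict race and measuring its accuracy. The sketch below is an illustration only, not the authors' pipeline: it uses a synthetic feature matrix and placeholder race labels standing in for radiomics features computed from DCE-MRI (e.g. with a tool such as pyradiomics), and the model parameters are arbitrary assumptions.

```python
# Minimal sketch (not the authors' code): fit a random forest on radiomics-style
# features to predict a protected attribute (race), then report test accuracy.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))    # placeholder: 200 patients x 100 radiomics features
y = rng.integers(0, 2, size=200)   # placeholder labels: 0 = White, 1 = Black

# Hold out a stratified test set so both classes appear in evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)

print("Race-prediction accuracy:", accuracy_score(y_test, rf.predict(X_test)))
```

On real radiomics features, an accuracy well above chance on this task (the paper reports 60-70%) would indicate that the features carry race-identifiable information; the same setup, with tumour molecular subtype as the target and a race-imbalanced training split, could be used to probe the performance bias described in the abstract.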