Dang Vien Ngoc, Campello Víctor M, Hernández-González Jerónimo, Lekadir Karim
Departament de Matemàtiques i Informàtica, Universitat de Barcelona, Barcelona, Spain.
Departament d'Informàtica, Matemàtica Aplicada i Estadística, Universitat de Girona, Girona, Spain.
J Healthc Inform Res. 2025 Mar 20;9(3):465-493. doi: 10.1007/s41666-025-00196-7. eCollection 2025 Sep.
Machine learning classifiers in healthcare tend to reproduce or exacerbate existing health disparities due to inherent biases in training data. This issue has drawn the attention of researchers in healthcare and other domains, who have proposed techniques that address it at different stages of the machine learning process. Post-processing methods adjust model predictions to improve fairness without interfering with the learning process or requiring access to the original training data, which preserves privacy and allows them to be applied to any trained model. This study rigorously compares state-of-the-art post-processing debiasing methods across a wide range of synthetic and real-world (healthcare) datasets, using several performance and fairness metrics. Our experiments reveal the strengths and weaknesses of each method, examining the trade-offs between group fairness and predictive performance, as well as among different notions of group fairness. Additionally, we analyze the impact on untreated attributes to verify that bias is mitigated overall. Our comprehensive evaluation provides insights into how these debiasing methods can best be implemented in healthcare settings to balance accuracy and fairness.
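To make the idea of post-processing concrete, the following is a minimal sketch of one generic technique of this family: choosing a separate decision threshold per group so that each group receives positive predictions at roughly the same rate (a demographic-parity heuristic). It is not taken from the study, and the abstract does not state which methods were compared; the function names, the toy scores, and the target rate are all assumptions made for illustration. Only the model's output scores and the group labels are used, which mirrors the property that no retraining or training data is needed.

```python
from collections import defaultdict
import math


def fit_group_thresholds(scores, groups, target_rate):
    """Pick one threshold per group so that about `target_rate` of that
    group is predicted positive. Hypothetical helper, for illustration only."""
    by_group = defaultdict(list)
    for s, g in zip(scores, groups):
        by_group[g].append(s)
    thresholds = {}
    for g, vals in by_group.items():
        vals.sort(reverse=True)
        k = round(target_rate * len(vals))  # number of positives to allow
        # The k-th highest score becomes the cut-off; no positives if k == 0.
        thresholds[g] = vals[k - 1] if k > 0 else math.inf
    return thresholds


def apply_thresholds(scores, groups, thresholds):
    """Turn scores into 0/1 predictions using each sample's group threshold."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


def demographic_parity_difference(preds, groups):
    """Gap between the highest and lowest per-group positive prediction rate."""
    counts, positives = defaultdict(int), defaultdict(int)
    for p, g in zip(preds, groups):
        counts[g] += 1
        positives[g] += p
    rates = [positives[g] / counts[g] for g in counts]
    return max(rates) - min(rates)


# Toy scores from an already trained classifier (made-up values).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Baseline: the same threshold for everyone.
baseline = apply_thresholds(scores, groups, {"a": 0.5, "b": 0.5})

# Post-processed: per-group thresholds aiming at a 50% positive rate.
fair = apply_thresholds(scores, groups, fit_group_thresholds(scores, groups, 0.5))

print(demographic_parity_difference(baseline, groups))
print(demographic_parity_difference(fair, groups))
```

On this toy input the per-group thresholds close the gap in positive rates, but they do so by changing individual predictions, which is where a cost in predictive performance can arise. Tied scores at the cut-off would make the achieved rate deviate from the target, so the sketch should be read as an approximation rather than an exact procedure.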
The online version contains supplementary material available at 10.1007/s41666-025-00196-7.