Department of Applied Mathematics, University of California Merced, Merced, California, USA.
BMJ Health Care Inform. 2022 Apr;29(1). doi: 10.1136/bmjhci-2021-100456.
To improve methodology for equitable suicide death prediction when sensitive predictors, such as race/ethnicity, are used in machine learning and statistical methods.
Predictive models (logistic regression, naive Bayes, gradient boosting (XGBoost) and random forests) were trained on emergency department (ED) administrative patient records using three resampling techniques (Blind, Separate, Equity). The Blind method resamples without considering racial/ethnic group. In contrast, the Separate method trains disjoint models for each group, and the Equity method builds a training set that is balanced both by racial/ethnic group and by class.
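As an illustration of the Equity method described above, the following is a minimal sketch of group-and-class balanced resampling, assuming a pandas DataFrame of ED records with hypothetical columns race_ethnicity (group) and suicide_death (binary label); it is an assumption-laden sketch, not the authors' exact implementation.

```python
import pandas as pd

def equity_resample(df: pd.DataFrame, group_col: str, label_col: str,
                    n_per_cell: int, random_state: int = 0) -> pd.DataFrame:
    """Build a training set balanced by both group and class:
    draw the same number of records from every (group, label) cell."""
    cells = []
    for _, cell in df.groupby([group_col, label_col]):
        # Oversample with replacement when a cell has fewer records than the target.
        cells.append(cell.sample(n=n_per_cell,
                                 replace=len(cell) < n_per_cell,
                                 random_state=random_state))
    # Shuffle the concatenated cells so training order is not grouped.
    return pd.concat(cells).sample(frac=1, random_state=random_state)

# Hypothetical usage (ed_records is the ED administrative training frame):
# train_balanced = equity_resample(ed_records, "race_ethnicity", "suicide_death", n_per_cell=500)
```

Under the same assumptions, the Separate method would instead fit one model per race_ethnicity value and score each test record with its own group's model, while the Blind method would resample the pooled data without reference to group.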
Using the Blind method, the range of the models' sensitivity for predicting suicide death across racial/ethnic groups (a measure of prediction inequity) was 0.47 for logistic regression, 0.37 for naive Bayes, 0.56 for XGBoost and 0.58 for random forest. By building separate models for different racial/ethnic groups or applying the Equity method to the training set, we reduced the range to 0.16, 0.13, 0.19 and 0.20 with the Separate method, and to 0.14, 0.12, 0.24 and 0.13 with the Equity method, respectively. XGBoost had the highest overall area under the curve (AUC), ranging from 0.69 to 0.79.
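The between-group range reported above can be computed as the difference between the highest and lowest per-group sensitivity (recall on the suicide-death class). A minimal sketch, assuming binary labels and array-like inputs; the names are illustrative and not taken from the paper.

```python
import numpy as np
from sklearn.metrics import recall_score

def sensitivity_range(y_true, y_pred, groups):
    """Per-group sensitivity and its max-minus-min range across groups,
    used here as a simple measure of prediction inequity."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    per_group = {g: recall_score(y_true[groups == g], y_pred[groups == g])
                 for g in np.unique(groups)}
    return per_group, max(per_group.values()) - min(per_group.values())
```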
We increased performance equity between different racial/ethnic groups and showed that imbalanced training sets lead to models with poor predictive equity. These methods achieve AUC scores comparable to other work in the field while using only single ED administrative record data.
We propose two methods to improve the equity of suicide death prediction among different racial/ethnic groups. These methods may be applied to other sensitive characteristics to improve equity in machine learning for healthcare applications.