Hossein Abad Zahra Shakeri, Kline Adrienne, Lee Joon
Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul;2020:5446-5449. doi: 10.1109/EMBC44109.2020.9176622.
Given the extensive use of machine learning in patient outcome prediction, and the understanding that the challenging nature of predictions in this field may considerably affect the performance of predictive models, research in this area requires context-sensitive performance metrics. The area under the receiver operating characteristic curve (AUC), precision, recall, specificity, and F1 are widely used measures of performance for patient outcome prediction. These metrics have several merits: they are easy to interpret and do not need any subjective input from the user. However, they weight all samples equally and do not adequately reflect the ability of predictive models to classify difficult samples. In this paper, we propose the Difficulty Weight Adjustment (DWA) algorithm, a simple method that incorporates the difficulty level of samples when evaluating predictive models. Using a large dataset of 139,367 unique ICU admissions within the eICU Collaborative Research Database (eICU-CRD), we show that the classification difficulty and the discrimination ability of samples are critical aspects that need to be considered when comparing machine learning models that predict patient outcomes.
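To illustrate the general idea of difficulty-aware evaluation, the following is a minimal sketch rather than the DWA algorithm itself: the abstract does not state how DWA derives or applies sample difficulty. The sketch assumes each sample already carries a non-negative difficulty weight and accumulates the confusion matrix from those weights instead of raw counts. The function names and the example weights are hypothetical.

```python
def weighted_confusion(y_true, y_pred, weights):
    """Accumulate confusion-matrix cells using per-sample weights."""
    tp = fp = tn = fn = 0.0
    for t, p, w in zip(y_true, y_pred, weights):
        if t == 1 and p == 1:
            tp += w
        elif t == 0 and p == 1:
            fp += w
        elif t == 0 and p == 0:
            tn += w
        else:  # t == 1 and p == 0
            fn += w
    return tp, fp, tn, fn


def weighted_metrics(y_true, y_pred, weights=None):
    """Precision, recall, specificity, and F1 from a weighted confusion matrix.

    With weights=None every sample counts equally, which reproduces the
    standard (unweighted) metrics.
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    tp, fp, tn, fn = weighted_confusion(y_true, y_pred, weights)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    specificity = tn / (tn + fp) if (tn + fp) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Toy labels and predictions (1 = adverse outcome, 0 = no adverse outcome).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]

# Hypothetical difficulty weights: samples 2 and 5 are assumed "hard".
difficulty = [1.0, 3.0, 1.0, 1.0, 3.0]

plain = weighted_metrics(y_true, y_pred)
adjusted = weighted_metrics(y_true, y_pred, difficulty)
print("unweighted:", plain)
print("difficulty-weighted:", adjusted)
```

In this toy case the model misses one hard positive and catches another, so weighted recall drops below the unweighted value while weighted precision rises. Equal-weight metrics cannot show this kind of shift.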