Makar Maggie, Ghassemi Marzyeh, Cutler David M, Obermeyer Ziad
Department of Emergency Medicine at Brigham & Women's Hospital, Boston, MA.
Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge.
Int J Mach Learn Comput. 2015 Jun;5(3):192-197. doi: 10.7763/IJMLC.2015.V5.506.
Risk prediction is central to both clinical medicine and public health. While many machine learning models have been developed to predict mortality, they are rarely applied in the clinical literature, where classification tasks typically rely on logistic regression. One reason for this is that existing machine learning models often seek to optimize predictions by incorporating features that are not present in the databases readily available to providers and policy makers, limiting generalizability and implementation. Here we tested a number of machine learning classifiers for prediction of six-month mortality in a population of elderly Medicare beneficiaries, using an administrative claims database of the kind available to the majority of health care payers and providers. We show that machine learning classifiers substantially outperform current widely used methods of risk prediction, but only when used with an improved feature set incorporating insights from clinical medicine, developed for this study. Our work has applications to supporting patient and provider decision making at the end of life, as well as population health-oriented efforts to identify patients at high risk of poor outcomes.