

A comparison of approaches to improve worst-case predictive model performance over patient subpopulations.

Affiliations

Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, 94305, USA.

Department of Computer Science, University of Toronto, Toronto, ON, Canada.

Publication Information

Sci Rep. 2022 Feb 28;12(1):3254. doi: 10.1038/s41598-022-07167-7.

Abstract

Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this problem without introducing additional harms. We conduct a large-scale empirical study of DRO and several variations of standard learning procedures to identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations compared to standard approaches for learning predictive models from electronic health records data. In the course of our evaluation, we introduce an extension to DRO approaches that allows for specification of the metric used to assess worst-case performance. We conduct the analysis for models that predict in-hospital mortality, prolonged length of stay, and 30-day readmission for inpatient admissions, and predict in-hospital mortality using intensive care data. We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures using the entire training dataset. These results imply that when it is of interest to improve model performance for patient subpopulations beyond what can be achieved with standard practices, it may be necessary to do so via data collection techniques that increase the effective sample size or reduce the level of noise in the prediction problem.
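The worst-case objective the abstract refers to (as in group distributionally robust optimization) can be sketched as follows. This is an illustrative reimplementation, not the authors' code: it trains a logistic model while maintaining exponentiated-gradient weights over groups, so that the group with the highest current loss is up-weighted in the next gradient step. All function and variable names here are our own, and the toy data is synthetic.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def group_dro_logistic(X, y, groups, n_groups, eta_q=0.1, lr=0.1, steps=500):
    """Minimize a softmax-weighted upper bound on the worst-group log loss.

    q holds a probability distribution over groups; it is updated by an
    exponentiated-gradient step toward the group with the largest loss,
    and the model gradient is taken under the q-weighted loss.
    """
    w = np.zeros(X.shape[1])
    q = np.full(n_groups, 1.0 / n_groups)  # weights over groups
    counts = np.bincount(groups, minlength=n_groups)
    for _ in range(steps):
        p = sigmoid(X @ w)
        # per-group average log loss
        losses = np.array([
            -np.mean(y[groups == g] * np.log(p[groups == g] + 1e-12)
                     + (1 - y[groups == g]) * np.log(1 - p[groups == g] + 1e-12))
            for g in range(n_groups)
        ])
        # up-weight the currently worst-performing group
        q *= np.exp(eta_q * losses)
        q /= q.sum()
        # gradient of the q-weighted loss: sum_g q_g * mean_loss_g
        sample_w = q[groups] / counts[groups]
        grad = X.T @ ((p - y) * sample_w)
        w -= lr * grad
    return w, q

# toy demo: a ~10% minority subgroup with a different decision boundary
rng = np.random.default_rng(0)
n = 2000
groups = (rng.random(n) < 0.1).astype(int)
X = rng.normal(size=(n, 2))
y = np.where(groups == 0, X[:, 0] > 0, X[:, 1] > 0).astype(float)
w, q = group_dro_logistic(X, y, groups, n_groups=2)
```

In this sketch the group weights `q` concentrate on whichever subgroup the current model serves worst, which is the mechanism the abstract contrasts with standard empirical risk minimization over the pooled training set.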

