

A comparison of approaches to improve worst-case predictive model performance over patient subpopulations.

Affiliations

Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, 94305, USA.

Department of Computer Science, University of Toronto, Toronto, ON, Canada.

Publication Information

Sci Rep. 2022 Feb 28;12(1):3254. doi: 10.1038/s41598-022-07167-7.

Abstract

Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this problem without introducing additional harms. We conduct a large-scale empirical study of DRO and several variations of standard learning procedures to identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations compared to standard approaches for learning predictive models from electronic health records data. In the course of our evaluation, we introduce an extension to DRO approaches that allows for specification of the metric used to assess worst-case performance. We conduct the analysis for models that predict in-hospital mortality, prolonged length of stay, and 30-day readmission for inpatient admissions, and predict in-hospital mortality using intensive care data. We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures using the entire training dataset. These results imply that when it is of interest to improve model performance for patient subpopulations beyond what can be achieved with standard practices, it may be necessary to do so via data collection techniques that increase the effective sample size or reduce the level of noise in the prediction problem.
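The worst-case objective the abstract refers to (as in group distributionally robust optimization) can be sketched as follows. This is an illustrative reimplementation, not the authors' code: it trains a logistic model while maintaining exponentiated-gradient weights over groups, so that the group with the highest current loss is up-weighted in the next gradient step. All function and variable names here are our own, and the toy data is synthetic.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def group_dro_logistic(X, y, groups, n_groups, eta_q=0.1, lr=0.1, steps=500):
    """Minimize a softmax-weighted upper bound on the worst-group log loss.

    q holds a probability distribution over groups; it is updated by an
    exponentiated-gradient step toward the group with the largest loss,
    and the model gradient is taken under the q-weighted loss.
    """
    w = np.zeros(X.shape[1])
    q = np.full(n_groups, 1.0 / n_groups)  # weights over groups
    counts = np.bincount(groups, minlength=n_groups)
    for _ in range(steps):
        p = sigmoid(X @ w)
        # per-group average log loss
        losses = np.array([
            -np.mean(y[groups == g] * np.log(p[groups == g] + 1e-12)
                     + (1 - y[groups == g]) * np.log(1 - p[groups == g] + 1e-12))
            for g in range(n_groups)
        ])
        # up-weight the currently worst-performing group
        q *= np.exp(eta_q * losses)
        q /= q.sum()
        # gradient of the q-weighted loss: sum_g q_g * mean_loss_g
        sample_w = q[groups] / counts[groups]
        grad = X.T @ ((p - y) * sample_w)
        w -= lr * grad
    return w, q

# toy demo: a ~10% minority subgroup with a different decision boundary
rng = np.random.default_rng(0)
n = 2000
groups = (rng.random(n) < 0.1).astype(int)
X = rng.normal(size=(n, 2))
y = np.where(groups == 0, X[:, 0] > 0, X[:, 1] > 0).astype(float)
w, q = group_dro_logistic(X, y, groups, n_groups=2)
```

In this sketch the group weights `q` concentrate on whichever subgroup the current model serves worst, which is the mechanism the abstract contrasts with standard empirical risk minimization over the pooled training set.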

