Pathophysiological Features in Electronic Medical Records Sustain Model Performance under Temporal Dataset Shift.

Authors

Brosula Raphael, Corbin Conor K, Chen Jonathan H

Affiliations

Genomic Center for Infectious Diseases, Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Department of Computer Science, Stanford University, Stanford, CA, USA.

Publication information

AMIA Jt Summits Transl Sci Proc. 2024 May 31;2024:95-104. eCollection 2024.

Abstract

Access to real-world data streams like electronic medical records (EMRs) has accelerated the development of supervised machine learning (ML) models for clinical applications. However, few studies investigate the differential impact of particular features in the EMR on model performance under temporal dataset shift. To explain how features in the EMR impact models over time, this study aggregates features into groups by their source (e.g., medication orders, diagnosis codes, and lab results) and into categories based on whether they reflect patient pathophysiology or healthcare processes. We adapt Shapley values to explain the marginal contribution of feature groups and feature categories to initial and sustained model performance. We investigate three standard clinical prediction tasks and find that, while feature contributions to initial performance differ across tasks, pathophysiological features help mitigate the deterioration of model discrimination over time. These results provide interpretable insights into how specific feature groups contribute to model performance and robustness to temporal dataset shift.
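The group-level attribution described in the abstract can be illustrated with a minimal sketch of exact Shapley values computed over feature groups rather than individual features. This is not the authors' implementation: the function name `group_shapley`, the group names, and the AUROC numbers in `toy_scores` are all hypothetical, and the value function is assumed to return the performance of a model trained on a given subset of groups.

```python
from itertools import combinations
from math import factorial


def group_shapley(groups, value):
    """Exact Shapley value of each feature group.

    groups: list of group names (hypothetical labels).
    value:  callable mapping a frozenset of group names to a performance
            score, e.g. the AUROC of a model trained on those groups.
    """
    n = len(groups)
    phi = {}
    for g in groups:
        others = [h for h in groups if h != g]
        total = 0.0
        for k in range(n):  # k = size of the coalition that excludes g
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain in performance from adding group g to s
                total += weight * (value(s | {g}) - value(s))
        phi[g] = total
    return phi


# Made-up AUROC values for illustration only (not results from the paper).
toy_scores = {
    frozenset(): 0.50,
    frozenset({"labs"}): 0.70,
    frozenset({"medications"}): 0.60,
    frozenset({"labs", "medications"}): 0.76,
}

phi = group_shapley(["labs", "medications"], toy_scores.__getitem__)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

The contributions sum to the gap between the full model and the empty baseline, which is the efficiency property of Shapley values. To attribute sustained rather than initial performance, one option (an assumption here, not the paper's stated definition) is to pass a value function that scores each subset's model on a later evaluation period.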



