Using Transfer Learning for Improved Mortality Prediction in a Data-Scarce Hospital Setting.

Authors

Desautels Thomas, Calvert Jacob, Hoffman Jana, Mao Qingqing, Jay Melissa, Fletcher Grant, Barton Chris, Chettipally Uli, Kerem Yaniv, Das Ritankar

Affiliations

Department of Research, Dascena, Inc, Hayward, CA, USA.

Division of General Internal Medicine, University of Washington School of Medicine, Seattle, WA, USA.

Publication information

Biomed Inform Insights. 2017 Jun 12;9:1178222617712994. doi: 10.1177/1178222617712994. eCollection 2017.

Abstract

Algorithm-based clinical decision support (CDS) systems associate patient-derived health data with outcomes of interest, such as in-hospital mortality. However, the quality of such associations often depends on the availability of site-specific training data. Without sufficient quantities of data, the underlying statistical apparatus cannot differentiate useful patterns from noise and, as a result, may underperform. This initial training data burden limits the widespread, out-of-the-box use of machine learning-based risk scoring systems. In this study, we implement a statistical transfer learning technique, which uses a large "source" data set to drastically reduce the amount of data needed to perform well on a "target" site for which training data are scarce. We test this transfer technique with a mortality prediction algorithm on patient charts from the Beth Israel Deaconess Medical Center (the source) and a population of 48 249 adult inpatients from University of California San Francisco Medical Center (the target institution). We find that the amount of training data required to surpass 0.80 area under the receiver operating characteristic curve (AUROC) on the target set decreases from more than 4000 patients to fewer than 220. This performance is superior to the Modified Early Warning Score (AUROC: 0.76) and corresponds to a decrease in clinical data collection time from approximately 6 months to less than 10 days. Our results highlight the usefulness of transfer learning in the specialization of CDS systems to new hospital sites, without requiring expensive and time-consuming data collection efforts.
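The abstract reports performance as AUROC (0.80 for the transferred model versus 0.76 for the Modified Early Warning Score). As a reading aid, the sketch below shows one standard way to compute that metric: the fraction of (positive, negative) patient pairs in which the positive case receives the higher risk score, with ties counted as half. The function name `auroc` and the toy labels and scores are hypothetical illustrations, not data or code from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = event, e.g. in-hospital death).
    scores: iterable of risk scores, higher meaning higher predicted risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = died, 0 = survived.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of patients, so it suits small illustrations; rank-based implementations give the same value more efficiently on large cohorts.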

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465c/5470861/60552667bcd9/10.1177_1178222617712994-fig1.jpg
