Department of Biomedical Data Science, Stanford University, Stanford, California.
Department of Radiology, Stanford University, Stanford, California.
JAMA Netw Open. 2019 Aug 2;2(8):e198719. doi: 10.1001/jamanetworkopen.2019.8719.
IMPORTANCE: Pulmonary embolism (PE) is a life-threatening clinical problem, and computed tomographic (CT) imaging is the standard for diagnosis. Clinical decision support rules based on PE risk-scoring models have been developed to compute pretest probability but are underused and tend to underperform in practice, leading to persistent overuse of CT imaging for PE.
OBJECTIVE: To develop a machine learning model that analyzes longitudinal clinical data to generate a patient-specific risk score for PE, serving as clinical decision support for patients referred for CT imaging for PE.
DESIGN, SETTING, AND PARTICIPANTS: In this diagnostic study, the proposed workflow for the machine learning model, the Pulmonary Embolism Result Forecast Model (PERFORM), transforms raw electronic medical record (EMR) data into temporal feature vectors and develops a decision analytical model targeted toward adult patients referred for CT imaging for PE. The model was tested on holdout patient EMR data from 2 large academic medical practices. A total of 3397 annotated CT imaging examinations for PE from 3214 unique patients seen at Stanford University hospitals and clinics were used for training and validation. The models were externally validated on 240 unique patients seen at Duke University Medical Center. The comparison with clinical scoring systems was performed on 100 randomly selected outpatient samples from Stanford University hospitals and clinics and 101 randomly selected outpatient samples from Duke University Medical Center.
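As an illustration of what turning raw EMR events into a temporal feature vector can look like, the following is a minimal sketch and not the PERFORM pipeline itself. It assumes each EMR event is a hypothetical `(day, code)` pair, bins events into fixed-width windows counted backward from the CT examination day, and counts code occurrences per window. The function name, the window width, the number of windows, and the example codes are all assumptions made for this illustration.

```python
from collections import Counter


def temporal_feature_vector(events, exam_day, vocabulary,
                            window_days=30, n_windows=12):
    """Count EMR codes per look-back window before a CT examination.

    events      -- iterable of (day, code); day is an integer day index
    exam_day    -- integer day index of the CT examination
    vocabulary  -- ordered list of codes that become feature columns
    Returns a flat list of length n_windows * len(vocabulary), where
    window 0 is the most recent period before (and including) exam_day.
    """
    column = {code: i for i, code in enumerate(vocabulary)}
    counts = Counter()
    for day, code in events:
        age = exam_day - day
        # Skip unknown codes and anything recorded after the examination.
        if code not in column or age < 0:
            continue
        window = age // window_days
        if window < n_windows:
            counts[(window, column[code])] += 1

    width = len(vocabulary)
    vector = [0] * (n_windows * width)
    for (window, col), n in counts.items():
        vector[window * width + col] = n
    return vector


# Hypothetical codes and events, for illustration only.
vocab = ["dx_dvt", "rx_anticoagulant"]
events = [(100, "dx_dvt"), (95, "rx_anticoagulant"),
          (60, "rx_anticoagulant"), (101, "dx_dvt"), (10, "other")]
features = temporal_feature_vector(events, exam_day=100, vocabulary=vocab,
                                   window_days=30, n_windows=2)
```

A fixed-length vector of this kind can be passed to any standard classifier; the real feature engineering choices (which data types, how windows are defined) are not specified in this abstract.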
MAIN OUTCOMES AND MEASURES: Prediction performance for diagnosing acute PE was evaluated using ElasticNet, artificial neural networks, and other machine learning approaches on holdout data sets from both institutions, and model performance was measured by area under the receiver operating characteristic curve (AUROC).
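For readers unfamiliar with the metric, the sketch below shows one common way to compute an AUROC and a percentile bootstrap 95% CI from binary labels and predicted risk scores. It is a generic illustration using only the Python standard library; the abstract does not state how the study computed its confidence intervals, so the bootstrap procedure, the function names, and the parameters here are assumptions.

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Resamples containing a single class have no defined AUROC.
        if 0 < sum(ys) < n:
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = positive PE study, 0 = negative; scores are model outputs.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
point_estimate = auroc(y_true, y_score)  # 0.75 for this toy example
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of positive and negative studies.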
RESULTS: Of the 3214 patients from Stanford University hospitals and clinics included in the study, 1704 (53.0%) were women; mean (SD) age was 60.53 (19.43) years. The 240 patients from Duke University Medical Center used for validation included 132 women (55.0%); mean (SD) age was 70.2 (14.2) years. In the samples for clinical scoring system comparisons, the 100 outpatients from Stanford University hospitals and clinics included 67 women (67.0%); mean (SD) age was 57.74 (19.87) years, and the 101 patients from Duke University Medical Center included 59 women (58.4%); mean (SD) age was 73.06 (15.3) years. The best-performing model achieved an AUROC of 0.90 (95% CI, 0.87-0.91) for predicting a positive PE study on intrainstitutional holdout data and an AUROC of 0.71 (95% CI, 0.69-0.72) on an external data set from Duke University Medical Center. Superior AUROC performance and cross-institutional generalization were noted on holdout outpatient populations, with AUROCs of 0.81 (95% CI, 0.77-0.87) on intrainstitutional data and 0.81 (95% CI, 0.73-0.82) on extrainstitutional data.
CONCLUSIONS AND RELEVANCE: The machine learning model, PERFORM, may consider a multitude of applicable patient-specific risk factors and dependencies to arrive at a PE risk prediction that generalizes to new population distributions. This approach might be used as an automated clinical decision support tool for patients referred for CT PE imaging to improve CT use.