A claims-based, machine-learning algorithm to identify patients with pulmonary arterial hypertension.

Authors

Hyde Bethany, Paoli Carly J, Panjabi Sumeet, Bettencourt Katherine C, Bell Lynum Karimah S, Selej Mona

Affiliations

Janssen Business Technology, Commercial Data Insights & Data Science, Titusville, New Jersey, USA.

Janssen Scientific Affairs, Inc., Titusville, New Jersey, USA.

Publication information

Pulm Circ. 2023 Jun 6;13(2):e12237. doi: 10.1002/pul2.12237. eCollection 2023 Apr.

Abstract

Many patients with pulmonary arterial hypertension (PAH) experience substantial delays in diagnosis, which are associated with worse outcomes and higher costs. Tools for diagnosing PAH sooner may lead to earlier treatment, which may delay disease progression and adverse outcomes including hospitalization and death. We developed a machine-learning (ML) algorithm to identify patients at risk for PAH earlier in their symptom journey and to distinguish them from patients with similar early symptoms who are not at risk for developing PAH. Our supervised ML model analyzed retrospective, de-identified data from the US-based Optum® Clinformatics® Data Mart claims database (January 2015 to December 2019). Propensity score-matched PAH and non-PAH (control) cohorts were established based on observed differences. Random forest models were used to classify patients as PAH or non-PAH at diagnosis and at 6 months prediagnosis. The PAH and non-PAH cohorts included 1339 and 4222 patients, respectively. At 6 months prediagnosis, the model performed well in distinguishing PAH from non-PAH patients, with an area under the receiver operating characteristic curve of 0.84, recall (sensitivity) of 0.73, and precision of 0.50. Key features distinguishing the PAH cohort from the non-PAH cohort were a longer time between first symptom and the prediagnosis model date (i.e., 6 months before diagnosis); more diagnostic and prescription claims, circulatory claims, and imaging procedures, leading to higher overall healthcare resource utilization; and more hospitalizations. Our model distinguishes between patients with and without PAH at 6 months before diagnosis and illustrates the feasibility of using routine claims data to identify, at a population level, patients who might benefit from PAH-specific screening and/or earlier specialist referral.
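
To make the reported metrics concrete, the following is a minimal sketch of how precision, recall, and the area under the receiver operating characteristic curve are computed. It is not the authors' code: the function names, the confusion-matrix counts, and the toy scores are all hypothetical, chosen only so that the first call reproduces the reported recall of 0.73 and precision of 0.50.

```python
from typing import Sequence, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts (NOT from the paper): of 100 true PAH patients,
# 73 are flagged (tp) and 27 are missed (fn); 73 non-PAH patients are
# flagged in error (fp).
precision, recall = precision_recall(tp=73, fp=73, fn=27)
print(f"precision={precision:.2f} recall={recall:.2f}")

# Toy scores (hypothetical) to show the pairwise-ranking view of AUC.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC={toy_auc:.2f}")
```

Read this way, a recall of 0.73 with a precision of 0.50 means that roughly three in four true PAH patients would be flagged, and about half of all flagged patients would turn out not to have PAH. That trade-off may be acceptable for a population-level screening or referral prompt rather than a diagnosis.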
