Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut.
VA Connecticut Healthcare System, West Haven.
JAMA Netw Open. 2024 Nov 4;7(11):e2443925. doi: 10.1001/jamanetworkopen.2024.43925.
IMPORTANCE: Serial functional status assessments are critical to heart failure (HF) management but are often described narratively in documentation, limiting their use in quality improvement or patient selection for clinical trials.
OBJECTIVE: To develop and validate a deep learning natural language processing (NLP) strategy for extracting functional status assessments from unstructured clinical documentation.
DESIGN, SETTING, AND PARTICIPANTS: This diagnostic study used electronic health record data collected from January 1, 2013, through June 30, 2022, from patients diagnosed with HF seeking outpatient care within 3 large practice networks in Connecticut (Yale New Haven Hospital [YNHH], Northeast Medical Group [NMG], and Greenwich Hospital [GH]). Expert-annotated notes were used for NLP model development and validation. Data were analyzed from February to April 2024.
EXPOSURES: Development and validation of NLP models to detect explicit New York Heart Association (NYHA) classification, HF symptoms during activity or rest, and frequency of functional status assessments.
MAIN OUTCOMES AND MEASURES: Outcomes of interest were model performance metrics, including area under the receiver operating characteristic curve (AUROC), and frequency of NYHA class documentation and HF symptom descriptions in unannotated notes.
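The abstract reports class-weighted AUROCs but does not say how the weighting was computed. The sketch below shows one common reading, assumed here for illustration only: a one-vs-rest AUROC per class, averaged with weights proportional to class prevalence (the same idea as scikit-learn's `roc_auc_score` with `multi_class="ovr"` and `average="weighted"`). The function names, class labels, and toy probabilities are hypothetical and are not taken from the study.

```python
def binary_auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def class_weighted_auroc(y_true, prob_rows, classes):
    """One-vs-rest AUROC per class, averaged with weights equal to class prevalence."""
    total = 0.0
    for k, cls in enumerate(classes):
        binary = [1 if y == cls else 0 for y in y_true]
        support = sum(binary)  # number of notes whose true label is this class
        class_scores = [row[k] for row in prob_rows]
        total += support * binary_auroc(binary, class_scores)
    return total / len(y_true)


# Toy data (hypothetical): 4 notes, 3 labels, one row of predicted probabilities per note.
toy_classes = ["none", "I", "II-III"]
toy_truth = ["none", "none", "I", "II-III"]
toy_probs = [
    [0.70, 0.20, 0.10],
    [0.30, 0.65, 0.05],
    [0.30, 0.60, 0.10],
    [0.10, 0.20, 0.70],
]
toy_auroc = class_weighted_auroc(toy_truth, toy_probs, toy_classes)
print(f"class-weighted AUROC on toy data: {toy_auroc:.3f}")
```

Prevalence weighting keeps rare labels (class IV appears in only 0.5% of unannotated notes) from dominating the summary figure, although it also means that performance on rare classes contributes little to it.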
RESULTS: This study included 34 070 patients with HF (mean [SD] age, 76.1 [12.6] years; 17 728 [52.0%] female). Among 3000 expert-annotated notes (2000 from YNHH and 500 each from NMG and GH), 374 notes (12.4%) mentioned NYHA class and 1190 notes (39.7%) described HF symptoms. The NYHA class detection model achieved a class-weighted AUROC of 0.99 (95% CI, 0.98-1.00) at YNHH, the development site. At the 2 validation sites, NMG and GH, the model achieved class-weighted AUROCs of 0.98 (95% CI, 0.96-1.00) and 0.98 (95% CI, 0.92-1.00), respectively. The model for detecting activity- or rest-related symptoms achieved an AUROC of 0.94 (95% CI, 0.89-0.98) at YNHH, 0.94 (95% CI, 0.91-0.97) at NMG, and 0.95 (95% CI, 0.92-0.99) at GH. Deploying the NYHA model among 182 308 unannotated notes from the 3 sites identified 23 830 notes (13.1%) with NYHA mentions, specifically 10 913 notes (6.0%) with class I, 12 034 notes (6.6%) with classes II or III, and 883 notes (0.5%) with class IV. An additional 19 730 encounters (10.8%) could be classified into functional status groups based on activity- or rest-related symptoms, resulting in a total of 43 560 medical notes (23.9%) categorized by NYHA, an 83% increase compared with explicit mentions alone.
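The deployment figures above can be checked with simple arithmetic. The following sketch uses only the counts reported in the results; the variable names are illustrative. It shows that the 83% figure is a relative increase (symptom-classified encounters divided by notes with explicit NYHA mentions), not a percentage of all notes.

```python
# Counts as reported in the results paragraph.
total_notes = 182_308
explicit_by_class = {"I": 10_913, "II or III": 12_034, "IV": 883}
symptom_only = 19_730  # classified from activity- or rest-related symptoms


def pct(part, whole):
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100 * part / whole, 1)


explicit_total = sum(explicit_by_class.values())
combined = explicit_total + symptom_only

summary = {
    "explicit_total": explicit_total,
    "explicit_pct": pct(explicit_total, total_notes),
    "symptom_pct": pct(symptom_only, total_notes),
    "combined": combined,
    "combined_pct": pct(combined, total_notes),
    # Relative gain over explicit mentions alone, rounded to a whole percent.
    "relative_increase_pct": round(100 * symptom_only / explicit_total),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```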
CONCLUSIONS AND RELEVANCE: In this diagnostic study of 34 070 patients with HF, the NLP approach accurately extracted a patient's NYHA symptom class and activity- or rest-related HF symptoms from clinical notes, enhancing the ability to track optimal care delivery and identify patients eligible for clinical trial participation from unstructured documentation.