Zheng Chengyi, Sun Benjamin C, Wu Yi-Lin, Ferencik Maros, Lee Ming-Sum, Redberg Rita F, Kawatkar Aniket A, Musigdilok Visanee V, Sharp Adam L
Research and Evaluation Department, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA 91101, USA.
Department of Emergency Medicine and Leonard Davis Institute, University of Pennsylvania, Philadelphia, PA 19104, USA.
Eur Heart J Digit Health. 2022 Sep 5;3(4):626-637. doi: 10.1093/ehjdh/ztac047. eCollection 2022 Dec.
Stress echocardiography (SE) findings and interpretations are commonly documented in free-text reports. Reusing SE results requires laborious manual reviews. This study aimed to develop and validate an automated method for abstracting SE reports in a large cohort.
This study included adult patients who underwent SE within 30 days of an emergency department visit for suspected acute coronary syndrome in a large integrated healthcare system. An automated natural language processing (NLP) algorithm was developed to abstract SE reports and classify overall SE results into normal, non-diagnostic, infarction, and ischaemia categories. Randomly selected reports (n = 140) were reviewed by cardiologists in a double-blind fashion to assess the criterion validity of the NLP algorithm. Construct validity was tested on the entire cohort using abstracted SE data and additional clinical variables. The NLP algorithm abstracted 6346 consecutive SE reports. Cardiologists had good agreement on the overall SE results for the 140 reports: Kappa (0.83) and intraclass correlation coefficient (0.89). On the overall SE results, the NLP algorithm achieved 98.6% specificity and negative predictive value and 95.7% sensitivity, positive predictive value, and F-score; it achieved near-perfect scores on ischaemia findings. The 30-day rate of acute myocardial infarction or death was highest among patients with ischaemia (5.0%), followed by infarction (1.4%), non-diagnostic (0.8%), and normal (0.3%) results. We found substantial variations in the format and quality of SE reports, even within the same institution.
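As an illustration of how validation metrics of this kind relate to one another, the following minimal Python sketch computes micro-averaged one-vs-rest metrics for a single-label, multi-class classifier. The abstract does not state how the metrics were averaged or how many reports were misclassified, so the function name `micro_averaged_metrics`, the micro-averaging scheme, and the error count of 6 out of 140 are assumptions made purely for illustration. Under those assumptions the arithmetic happens to yield the same percentages as those reported, which is not evidence that the authors computed them this way.

```python
def micro_averaged_metrics(n_reports, n_errors, n_classes):
    """Micro-averaged one-vs-rest metrics for single-label classification.

    Hypothetical helper for illustration only. Each misclassified report
    counts as one false negative (for its true class) and one false
    positive (for its predicted class).
    """
    tp = n_reports - n_errors          # correctly classified reports
    fp = n_errors                      # one false positive per error
    fn = n_errors                      # one false negative per error
    # n_reports * n_classes one-vs-rest decisions in total; the remainder
    # after removing TP, FP, and FN are true negatives.
    tn = n_reports * n_classes - tp - fp - fn

    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "ppv": ppv,
        "specificity": specificity,
        "npv": npv,
        "f_score": f_score,
    }


# Assumed inputs: 140 validated reports, 4 result categories,
# and a hypothetical 6 misclassified reports.
metrics = micro_averaged_metrics(n_reports=140, n_errors=6, n_classes=4)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

Because every error adds exactly one false positive and one false negative under micro-averaging, sensitivity, positive predictive value, and F-score coincide, as do specificity and negative predictive value.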
Natural language processing is an accurate and efficient method for abstracting unstructured SE reports. This approach creates new opportunities for research, public health measures, and care improvement.