Johnson Stacy A, Signor Emily A, Lappe Katie L, Shi Jianlin, Jenkins Stephen L, Wikstrom Sara W, Kroencke Rachel D, Hallowell David, Jones Aubrey E, Witt Daniel M
Division of General Internal Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America; Thrombosis Service, University of Utah Health, Salt Lake City, UT, United States of America.
Thromb Res. 2021 Jul;203:190-195. doi: 10.1016/j.thromres.2021.04.020. Epub 2021 May 6.
International Classification of Diseases, 10th Revision (ICD-10) codes are frequently used to identify pulmonary embolism (PE) events, although their validity has been questioned. Natural language processing (NLP) is a novel tool that may be useful for PE identification.
We performed a retrospective comparative accuracy study of 1000 randomly selected healthcare encounters with a CT pulmonary angiogram ordered between January 1, 2019 and January 1, 2020 at a single academic medical center. Two independent observers reviewed each radiology report and abstracted key findings related to PE presence/absence, chronicity, and anatomic location. NLP interpretations of radiology reports and ICD-10 codes were queried electronically and compared to the reference standard, manual chart review.
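As an illustration of the comparison described above, the following minimal sketch computes the sensitivity and specificity of an index test (NLP or ICD-10) against the manual-review reference standard. The function name and the toy labels are hypothetical and are not the study's code or data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    reference: Sequence[bool], index_test: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of index_test against reference.

    Each position is one encounter; True means "PE present".
    """
    if len(reference) != len(index_test):
        raise ValueError("reference and index_test must have the same length")

    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, index_test):
        if ref and pred:
            tp += 1      # PE on manual review, flagged by the index test
        elif ref:
            fn += 1      # PE on manual review, missed by the index test
        elif pred:
            fp += 1      # no PE on manual review, flagged anyway
        else:
            tn += 1      # no PE on manual review, correctly not flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")

    return tp / (tp + fn), tn / (tn + fp)


# Toy example (made-up labels, NOT study data): 4 PE-positive, 6 PE-negative.
manual_review = [True, True, True, True, False, False, False, False, False, False]
nlp_flags = [True, True, True, False, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(manual_review, nlp_flags)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

The same function can be applied once with the NLP flags and once with the ICD-10 flags to obtain the paired estimates that are then compared.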
A total of 970 encounters were included for analysis. The prevalence of PE was 13% by manual review. For PE identification, sensitivity was similar between NLP (96.0%) and ICD-10 (92.9%; p = 0.405), and specificity was significantly higher with NLP (97.7%) compared to ICD-10 (91.0%; p < 0.001). NLP demonstrated higher sensitivity (70.0% vs 16.5%, p < 0.001) and specificity (99.9% vs 99.4%, p = 0.014) for saddle/main PE recognition, and significantly higher sensitivity (86.7% vs 8.3%, p < 0.001) and specificity (99.8% vs 96.5%, p < 0.001) for subsegmental PE compared to ICD-10.
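The abstract does not state which statistical test produced these p-values. Because NLP and ICD-10 were applied to the same encounters, one common choice for such paired comparisons is McNemar's test; the sketch below implements its exact (binomial) form under that assumption. The function names and the toy counts are hypothetical and do not come from the study.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_pairs(
    test_a: Sequence[bool], test_b: Sequence[bool]
) -> Tuple[int, int]:
    """Count encounters where exactly one of the two tests is positive.

    To compare sensitivities, pass only reference-positive encounters;
    to compare specificities, pass only reference-negative encounters.
    """
    a_only = sum(1 for a, b in zip(test_a, test_b) if a and not b)
    b_only = sum(1 for a, b in zip(test_a, test_b) if b and not a)
    return a_only, b_only


def mcnemar_exact(a_only: int, b_only: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = a_only + b_only
    if n == 0:
        return 1.0  # the tests never disagree, so there is no evidence of a difference
    k = min(a_only, b_only)
    # Under the null hypothesis, each discordant pair favours either test
    # with probability 0.5, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Toy example (made-up counts, NOT study data): among reference-positive
# encounters, test A alone caught 9 cases and test B alone caught 1.
p_value = mcnemar_exact(9, 1)
print(f"p = {p_value:.4f}")
```

Only the discordant encounters enter the calculation: cases that both methods classify the same way carry no information about which method is better.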
NLP is highly sensitive for PE identification and more specific than ICD-10 coding. NLP outperformed ICD-10 coding for recognition of subsegmental, saddle, and chronic PE. Our results suggest NLP is an efficient and more reliable method than ICD-10 for PE identification and characterization.