Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT.
Department of Medicine, University of Utah School of Medicine, Salt Lake City, UT; Thrombosis Service, University of Utah Health, Salt Lake City, UT.
Surgery. 2021 Oct;170(4):1175-1182. doi: 10.1016/j.surg.2021.04.027. Epub 2021 Jun 3.
The objective of this study was to develop a portable natural language processing approach to aid in the identification of postoperative venous thromboembolism events from free-text clinical notes.
We abstracted clinical notes from 25,494 operative events from 2 independent health care systems. A venous thromboembolism detected as part of the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) was used as the reference standard. A natural language processing engine, easy clinical information extractor-pulmonary embolism/deep vein thrombosis (EasyCIE-PEDVT), was trained to detect pulmonary embolism and deep vein thrombosis from clinical notes. International Classification of Diseases (ICD) discharge diagnosis codes for venous thromboembolism were used as baseline comparators. The classification performance of EasyCIE-PEDVT was compared with that of ICD codes using sensitivity, specificity, and area under the receiver operating characteristic curve in an internal and an external validation cohort.
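As an illustration of the comparison metrics named above, the following minimal Python sketch computes sensitivity and specificity from binary labels against a reference standard. It also computes the area under the receiver operating characteristic curve for a detector with a single yes/no output, which reduces to the mean of sensitivity and specificity. The function names and the toy labels are hypothetical and are not taken from the study; the abstract does not state how the curve was constructed, so the single-threshold form is an assumption.

```python
from typing import Sequence, Tuple


def confusion_counts(reference: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 means the event is present."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    tn = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 0)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of reference-standard events that the detector flagged."""
    if tp + fn == 0:
        raise ValueError("no reference-positive cases")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of reference-standard non-events that the detector left unflagged."""
    if tn + fp == 0:
        raise ValueError("no reference-negative cases")
    return tn / (tn + fp)


def binary_auc(sens: float, spec: float) -> float:
    """AUC of a single-threshold (yes/no) classifier: trapezoid through (1 - spec, sens)."""
    return (sens + spec) / 2


# Toy labels, invented for illustration only (not study data).
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(reference, predicted)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={binary_auc(sens, spec):.3f}")
```

Applying the same functions to the output of a second detector, such as discharge diagnosis codes, on the same reference labels would give the paired figures needed for a side-by-side comparison. Testing whether two areas differ significantly requires a separate statistical procedure that this sketch does not cover.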
To detect pulmonary embolism, EasyCIE-PEDVT had a sensitivity of 0.714 and 0.815 in internal and external validation, respectively. To detect deep vein thrombosis, EasyCIE-PEDVT had a sensitivity of 0.846 and 0.849 in internal and external validation, respectively. EasyCIE-PEDVT had significantly higher discrimination for deep vein thrombosis compared with International Classification of Diseases codes in internal validation (area under the receiver operating characteristic curve: 0.920 vs 0.761; P < .001) and external validation (area under the receiver operating characteristic curve: 0.921 vs 0.794; P < .001). There was no significant difference in the discrimination for pulmonary embolism between EasyCIE-PEDVT and ICD codes.
Accurate surveillance of postoperative venous thromboembolism may be achieved using natural language processing on clinical notes in 2 independent health care systems. These findings suggest natural language processing may augment manual chart abstraction for large registries such as NSQIP.