Center for Access and Delivery Research and Evaluation, Iowa City Veterans Affairs Medical Center (J.E.F., C.F., M.V.-S., B.C.L., S.G.), Department of Medicine, University of Iowa Carver College of Medicine, Iowa City.
Division of Cardiovascular Diseases, Massachusetts General Hospital, Boston (A.H.Q.).
Circ Cardiovasc Interv. 2022 Mar;15(3):e011092. doi: 10.1161/CIRCINTERVENTIONS.121.011092. Epub 2022 Feb 18.
Despite its high prevalence and clinical impact, research on peripheral artery disease (PAD) remains limited due to poor accuracy of billing codes. Ankle-brachial index (ABI) and toe-brachial index can be used to identify PAD patients with high accuracy within electronic health records.
We developed a novel natural language processing (NLP) algorithm for extracting ABI and toe-brachial index values and laterality (right or left) from ABI reports. A random sample of 800 reports from 94 Veterans Affairs facilities during 2015 to 2017 was selected and annotated by clinical experts. We trained the NLP system using random forest models and optimized it through sequential iterations of 10-fold cross-validation and error analysis on 600 test reports and evaluated its final performance on a separate set of 200 reports. We also assessed the accuracy of NLP-extracted ABI and toe-brachial index values for identifying patients with PAD in a separate cohort undergoing ABI testing.
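The report split described above can be sketched as follows. This is only an illustration of the bookkeeping, assuming a simple random shuffle: 800 annotated reports, 200 held out for final evaluation, and the remaining 600 divided into 10 folds for cross-validation. The function names, the seed, and the integer report IDs are hypothetical, and the random forest training itself is not shown.

```python
import random

def split_reports(report_ids, n_holdout=200, n_folds=10, seed=0):
    """Shuffle report IDs, set aside a held-out set, and fold the rest."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    ids = list(report_ids)
    rng.shuffle(ids)
    holdout, development = ids[:n_holdout], ids[n_holdout:]
    # Deal the development reports round-robin into n_folds groups.
    folds = [development[i::n_folds] for i in range(n_folds)]
    return development, holdout, folds

def cross_validation_rounds(folds):
    """Yield (training, validation) pairs, one per fold."""
    for i, validation in enumerate(folds):
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield training, validation

# Hypothetical report IDs 0..799 stand in for the annotated reports.
development, holdout, folds = split_reports(range(800))
rounds = list(cross_validation_rounds(folds))
```

Each of the 10 rounds trains on 540 reports and validates on the remaining 60, and the 200 held-out reports are never used until the final evaluation.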
The NLP system had an overall precision (positive predictive value) of 0.85, recall (sensitivity) of 0.93, and F1 measure (the harmonic mean of precision and recall) of 0.89 for correctly identifying ABI/toe-brachial index values and laterality. Among 261 patients with ABI testing (49% PAD), the NLP system achieved a positive predictive value of 92.3%, sensitivity of 83.1%, and specificity of 93.1% for identifying PAD when compared with a structured chart review. These findings were consistent across a range of sensitivity analyses.
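As a quick check on how these metrics relate, the sketch below computes F1 from the reported precision and recall, and shows how positive predictive value, sensitivity, and specificity follow from a 2x2 confusion table. The function names are hypothetical, and the counts passed to `diagnostic_metrics` are made-up round numbers for illustration, not the study's data.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def diagnostic_metrics(tp, fp, fn, tn):
    """PPV, sensitivity, and specificity from confusion-table counts."""
    return {
        "ppv": tp / (tp + fp),           # of flagged patients, fraction with PAD
        "sensitivity": tp / (tp + fn),   # of PAD patients, fraction flagged
        "specificity": tn / (tn + fp),   # of non-PAD patients, fraction not flagged
    }

# Reported extraction performance: precision 0.85, recall 0.93.
f1 = f1_score(0.85, 0.93)

# Illustrative counts only (hypothetical, not taken from the study).
example = diagnostic_metrics(tp=8, fp=2, fn=2, tn=8)
```

With the reported precision and recall, `f1` rounds to 0.89, which agrees with the F1 value stated in the abstract.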
We successfully developed and validated an NLP system for identifying patients with PAD within the Veterans Affairs electronic health record. Our findings have broad implications for PAD research and quality improvement.