Performance of a Natural Language Processing (NLP) Tool to Extract Pulmonary Function Test (PFT) Reports from Structured and Semistructured Veteran Affairs (VA) Data.

Author information

Sauer Brian C, Jones Barbara E, Globe Gary, Leng Jianwei, Lu Chao-Chin, He Tao, Teng Chia-Chen, Sullivan Patrick, Zeng Qing

Affiliations

Salt Lake IDEAS Center, Veteran Affairs; Division of Epidemiology, Department of Internal Medicine, School of Medicine, University of Utah.

Amgen Inc.

Publication information

EGEMS (Wash DC). 2016 Jun 1;4(1):1217. doi: 10.13063/2327-9214.1217. eCollection 2016.

Abstract

INTRODUCTION/OBJECTIVE: Pulmonary function tests (PFTs) are objective estimates of lung function, but are not reliably stored as structured data within the Veteran Health Affairs data systems. The aim of this study was to validate the natural language processing (NLP) tool we developed, which extracts spirometric values and responses to bronchodilator administration, against expert review, and to estimate the number of additional spirometric tests identified beyond the structured data.

METHODS

All patients at seven Veteran Affairs Medical Centers with a diagnostic code for asthma between January 1, 2006 and December 31, 2012 were included. Evidence of spirometry with a bronchodilator challenge (BDC) was extracted from structured data as well as clinical documents. The NLP tool's performance was compared against a human reference standard using a random sample of 1,001 documents.
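To make the extraction step concrete, the following is a minimal sketch of pulling pre- and post-bronchodilator pairs for forced vital capacity (FVC) and forced expiratory volume in one second (FEV1) out of semistructured text. It is not the authors' tool: the note layout in `SAMPLE_NOTE`, the pattern `PAIR_PATTERN`, and the function `extract_pairs` are all hypothetical, and real clinical documents would vary far more than this single layout.

```python
import re

# Hypothetical semistructured report text; invented for illustration only,
# not taken from the study data.
SAMPLE_NOTE = """
FVC   PRE: 3.10 L   POST: 3.25 L
FEV1  PRE: 2.05 L   POST: 2.40 L
"""

# Assumed layout: measure name, then "PRE: <value> L", then "POST: <value> L".
PAIR_PATTERN = re.compile(
    r"\b(?P<measure>FVC|FEV1)\b\s+"
    r"PRE:\s*(?P<pre>\d+(?:\.\d+)?)\s*L\s+"
    r"POST:\s*(?P<post>\d+(?:\.\d+)?)\s*L",
    re.IGNORECASE,
)


def extract_pairs(text: str) -> dict[str, tuple[float, float]]:
    """Return {measure: (pre, post)} for each complete pre/post pair found."""
    pairs = {}
    for match in PAIR_PATTERN.finditer(text):
        measure = match.group("measure").upper()
        pairs[measure] = (float(match.group("pre")), float(match.group("post")))
    return pairs


pairs = extract_pairs(SAMPLE_NOTE)
print(pairs)
```

A measure appears in the result only when both its pre and post values are present, which mirrors the idea of a complete bronchodilator challenge pair.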

RESULTS

In the validation set, the NLP tool demonstrated a precision of 98.9 percent (95 percent confidence interval (CI): 93.9 percent, 99.7 percent), a recall of 97.8 percent (95 percent CI: 92.2 percent, 99.7 percent), and an F-measure of 98.3 percent for the forced vital capacity pre- and post-bronchodilator pairs. For the forced expiratory volume in one second pre- and post-bronchodilator pairs, it demonstrated a precision of 100 percent (95 percent CI: 96.6 percent, 100 percent), a recall of 100 percent (95 percent CI: 96.6 percent, 100 percent), and an F-measure of 100 percent. Application of the NLP tool increased the proportion identified with a complete bronchodilator challenge by 25 percent.
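The reported F-measures can be checked from the precision and recall values. The sketch below assumes the F-measure is the balanced F1 score, the harmonic mean of precision and recall; the abstract does not state the weighting, but the reported figures are consistent with that assumption. The function name `f_measure` is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
fvc_f = f_measure(0.989, 0.978)   # forced vital capacity pairs
fev1_f = f_measure(1.0, 1.0)      # forced expiratory volume in one second pairs

print(f"FVC pairs:  {fvc_f:.1%}")   # prints "FVC pairs:  98.3%"
print(f"FEV1 pairs: {fev1_f:.1%}")  # prints "FEV1 pairs: 100.0%"
```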

DISCUSSION/CONCLUSION: This technology can improve identification of PFTs for epidemiologic research. Caution must be taken in assuming that a single domain of clinical data can completely capture the scope of a disease, treatment, or clinical test.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c89/4909376/ef917f6759cb/egems1217f1.jpg
