Yadav Kabir, Sarioglu Efsun, Choi Hyeong Ah, Cartwright Walter B, Hinds Pamela S, Chamberlain James M
Department of Emergency Medicine, Harbor-UCLA Medical Center, Torrance, CA.
Computer Science Department, Portland State University, Portland, OR.
Acad Emerg Med. 2016 Feb;23(2):171-8. doi: 10.1111/acem.12859. Epub 2016 Jan 14.
The authors have previously demonstrated highly reliable automated classification of free-text computed tomography (CT) imaging reports using a hybrid system that pairs linguistic (natural language processing) and statistical (machine learning) techniques. That work identified the outcome of orbital fracture in unprocessed radiology reports from a clinical data repository; the performance has not been replicated for more complex outcomes.
To validate automated outcome classification performance of a hybrid natural language processing (NLP) and machine learning system for brain CT imaging reports. The hypothesis was that our system has performance characteristics for identifying pediatric traumatic brain injury (TBI).
This was a secondary analysis of a subset of 2,121 CT reports from the Pediatric Emergency Care Applied Research Network (PECARN) TBI study. For that project, radiologists dictated CT reports as free text, which were then deidentified and scanned as PDF documents. Trained data abstractors manually coded each report for TBI outcome. Text was extracted from the PDF files using optical character recognition. The data set was randomly split evenly for training and testing. Training patient reports were used as input to the Medical Language Extraction and Encoding (MedLEE) NLP tool to create structured output containing standardized medical terms and modifiers for negation, certainty, and temporal status. A random subset stratified by site was analyzed using descriptive quantitative content analysis to confirm identification of TBI findings based on the National Institute of Neurological Disorders and Stroke (NINDS) Common Data Elements project. Findings were coded for presence or absence, weighted by frequency of mentions, and past/future/indication modifiers were filtered. After combining with the manual reference standard, a decision tree classifier was created using data mining tools WEKA 3.7.5 and Salford Predictive Miner 7.0. Performance of the decision tree classifier was evaluated on the test patient reports.
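The feature-coding step described above (presence or absence, weighting by frequency of mentions, and filtering of past/future/indication modifiers) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the dictionary layout, the field names `term`, `status`, and `negated`, and the helper names `code_findings` and `to_feature_vector` are hypothetical and do not reproduce MedLEE's actual output format or the authors' implementation.

```python
from collections import Counter

# Hypothetical modifier values; MedLEE's real output vocabulary is not reproduced here.
EXCLUDED_STATUS = {"past", "future", "indication"}


def code_findings(findings):
    """Return {term: mention count} for asserted, current findings only."""
    counts = Counter()
    for finding in findings:
        if finding.get("status") in EXCLUDED_STATUS:
            continue  # drop history, planned, or reason-for-exam mentions
        if finding.get("negated", False):
            continue  # a negated mention is treated as absence
        counts[finding["term"]] += 1  # weight by frequency of mentions
    return dict(counts)


def to_feature_vector(coded, vocabulary):
    """Map coded findings onto a fixed term order; 0 means absent."""
    return [coded.get(term, 0) for term in vocabulary]


# Illustrative structured output for one report (made-up data).
report = [
    {"term": "subdural hematoma", "status": "current", "negated": False},
    {"term": "subdural hematoma", "status": "current", "negated": False},
    {"term": "skull fracture", "status": "current", "negated": True},
    {"term": "head trauma", "status": "indication", "negated": False},
]
vocabulary = ["skull fracture", "subdural hematoma"]

coded = code_findings(report)
features = to_feature_vector(coded, vocabulary)
```

Vectors of this kind, paired with the manually coded reference standard, are the sort of input a decision tree learner would take; the study itself built its classifier with WEKA and Salford Predictive Miner rather than with code like this.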
The prevalence of TBI in the sampled population was 159 of 2,217 (7.2%). Automated classification performance for pediatric TBI was comparable to our prior results, with the notable exception of lower positive predictive value. Manual review of misclassified reports, 95.5% of which were false-positives, revealed that a sizable number of false-positive errors arose from two sources: differing outcome definitions between NINDS TBI findings and PECARN clinically important TBI findings, and ambiguous reports that did not meet definition criteria.
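To make the relationship between false-positives and positive predictive value concrete, the following Python sketch computes the standard performance measures from a 2x2 confusion matrix. The counts are invented for illustration only and are not the study's results; the function name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all actual positives
        "specificity": tn / (tn + fp),  # true negatives among all actual negatives
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Made-up counts with a low-prevalence outcome (80 positives in 1,000 reports).
metrics = classification_metrics(tp=70, fp=60, fn=10, tn=860)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, sensitivity, specificity, and negative predictive value all stay high while positive predictive value falls to about 0.54. When an outcome is uncommon, even a modest number of false-positives makes up a large share of all positive calls.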
A hybrid NLP and machine learning automated classification system continues to show promise in coding free-text electronic clinical data. For complex outcomes, it can reliably identify negative reports, but manual review of positive reports may be required. As such, it can still streamline data collection for clinical research and performance improvement.