Department of Emergency Medicine, The George Washington University, Washington, DC.
Acad Emerg Med. 2013 Aug;20(8):848-54. doi: 10.1111/acem.12174.
Reliably abstracting outcomes from free-text electronic health records remains a challenge. While automated classification of free text has been a popular medical informatics topic, performance validation using real-world clinical data has been limited. The two main approaches are linguistic (natural language processing [NLP]) and statistical (machine learning). The authors have developed a hybrid system for abstracting specified outcomes from computed tomography (CT) reports.
The objective was to measure the performance of a hybrid NLP and machine learning system for automated outcome classification of emergency department (ED) CT imaging reports. The hypothesis was that such a system performs comparably to medical personnel doing the data abstraction.
A secondary analysis was performed on a prior diagnostic imaging study of 3,710 blunt facial trauma victims. Staff radiologists dictated CT reports as free text, which were then deidentified. A trained data abstractor manually coded the reference standard outcome of acute orbital fracture, with a random subset double-coded for reliability. The data set was randomly split evenly into training and testing sets. Reports from the training set were used as input to the Medical Language Extraction and Encoding (MedLEE) NLP tool to create structured output containing standardized medical terms and modifiers for certainty and temporal status. Findings were filtered to exclude those with low certainty or past/future modifiers and then combined with the manual reference standard to generate decision tree classifiers using the data mining tools Waikato Environment for Knowledge Analysis (WEKA) 3.7.5 and Salford Predictive Miner 6.6. Performance of the decision tree classifiers was evaluated on the testing set with and without NLP processing.
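The filtering and splitting steps described above can be sketched roughly as follows. This is a minimal illustration only: the field names (`term`, `certainty`, `status`), the modifier values, and the helper names are hypothetical and do not reflect MedLEE's actual output schema or the authors' code.

```python
import random

# Hypothetical modifier values; the real MedLEE vocabulary is not given in the abstract.
LOW_CERTAINTY = {"low"}
NON_CURRENT = {"past", "future"}


def keep_finding(finding):
    """Keep a finding unless it is low certainty or refers to the past/future."""
    return (finding.get("certainty") not in LOW_CERTAINTY
            and finding.get("status") not in NON_CURRENT)


def report_features(findings):
    """Reduce one report's findings to the set of terms that survive filtering."""
    return {f["term"] for f in findings if keep_finding(f)}


def split_evenly(report_ids, seed=0):
    """Randomly split report identifiers into two halves (training, testing)."""
    ids = sorted(report_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


# Illustrative findings for a single made-up report.
example_findings = [
    {"term": "orbital fracture", "certainty": "high", "status": "current"},
    {"term": "nasal fracture", "certainty": "low", "status": "current"},
    {"term": "mandible fracture", "certainty": "high", "status": "past"},
]
features = report_features(example_findings)
train_ids, test_ids = split_evenly(range(10))
```

The surviving term sets, paired with the manually coded outcome, would then serve as input features for a decision tree learner; that step is left out here because it depends on the specific tool used.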
The performance of machine learning alone was comparable to prior NLP studies (sensitivity = 0.92, specificity = 0.93, precision = 0.95, recall = 0.93, f-score = 0.94), and the combined use of NLP and machine learning showed further improvement (sensitivity = 0.93, specificity = 0.97, precision = 0.97, recall = 0.96, f-score = 0.97). This performance is similar to, or better than, that of medical personnel in previous studies.
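For readers less familiar with these measures, the sketch below shows how they are conventionally derived from a binary confusion matrix. The counts are invented for illustration and are not the study's data. Under these single-class definitions recall equals sensitivity; the abstract reports the two separately, so its recall and precision figures may have been computed differently (for example, averaged across classes), which is an assumption rather than something the abstract states.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (also called recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    # F-score: harmonic mean of precision and recall.
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


# Illustrative counts only; not taken from the study.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```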
A hybrid NLP and machine learning automated classification system shows promise in coding free-text electronic clinical data.