Department of Radiology, Peking University First Hospital, Beijing 100034, China.
Chin Med J (Engl). 2019 Jul 20;132(14):1673-1680. doi: 10.1097/CM9.0000000000000301.
Because structured reports are not widely used, most reports exist as free text. Manual data extraction by experts is time-consuming and error-prone, whereas extraction by natural language processing (NLP) is a potential solution that could improve diagnostic efficiency and accuracy. The purpose of this study was to evaluate an NLP program that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) descriptors and final assessment categories from breast magnetic resonance imaging (MRI) reports.
This cross-sectional study involved 2330 breast MRI reports in the electronic medical record from 2009 to 2017. We used 1635 reports to create a revised BI-RADS MRI lexicon and synonym lists and to iteratively develop an NLP system. The remaining 695 reports, which were not used in development, served as an independent test set for the final evaluation of the NLP system. The recall and precision of the NLP algorithm in detecting the revised BI-RADS MRI descriptors and BI-RADS categories from the free-text reports were evaluated against a reference standard of manual review.
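The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the study's system: the synonym entries, the function names `extract_descriptors` and `precision_recall`, and the simple phrase-matching approach are all assumptions, since the abstract does not describe how the NLP algorithm works internally. The sketch maps synonym phrases to canonical descriptors, then pools true positives, false positives, and false negatives across reports to compute precision and recall against manual annotations.

```python
import re

# Hypothetical synonym list: canonical descriptor -> phrases seen in reports.
# Illustrative entries only; this is NOT the study's revised lexicon.
SYNONYMS = {
    "irregular shape": ["irregular shape", "irregularly shaped"],
    "spiculated margin": ["spiculated margin", "spiculated border"],
    "rim enhancement": ["rim enhancement", "ring enhancement"],
}


def extract_descriptors(report):
    """Return the canonical descriptors whose synonyms occur in the report."""
    text = report.lower()
    found = set()
    for canonical, phrases in SYNONYMS.items():
        for phrase in phrases:
            # Whole-phrase match so partial words are not counted.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                found.add(canonical)
                break
    return found


def precision_recall(predicted, reference):
    """Pool counts over all reports and return (precision, recall).

    predicted, reference: lists of sets, one set of descriptors per report.
    """
    tp = fp = fn = 0
    for pred, ref in zip(predicted, reference):
        tp += len(pred & ref)   # extracted and in the manual annotation
        fp += len(pred - ref)   # extracted but not annotated
        fn += len(ref - pred)   # annotated but missed by the extractor
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In this setup, a descriptor that reviewers annotated but that has no entry in the synonym list lowers recall without affecting precision, which is one reason a lexicon is typically refined iteratively on development reports before testing on held-out ones.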
There was a high level of agreement between the two manual reviewers, with a κ value of 0.95. For all breast imaging reports, the NLP algorithm demonstrated a recall of 78.5% and a precision of 86.1% for correct identification of the revised BI-RADS MRI descriptors and the BI-RADS categories. The NLP algorithm generated the total results in <1 s, whereas the two manual reviewers averaged 3.38 and 3.23 min per report, respectively.
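For readers unfamiliar with the κ statistic, the following is a minimal sketch of Cohen's kappa for two raters. It assumes the reported κ is the unweighted Cohen's form, which the abstract does not state; the function name `cohen_kappa` and the example labels are illustrative and are not the study's data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

Kappa corrects raw agreement for the agreement expected by chance, so two reviewers who agree on 75% of items can still have a kappa well below 0.75 when a few categories dominate.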
The NLP algorithm demonstrates high recall and precision for information extraction from free-text reports. This approach will help to narrow the gap between unstructured report text and the structured data needed for decision support and other applications.