Department of Diagnostic Radiology, University Medical Center Freiburg, Hugstetterstrasse 55, 79106, Freiburg, Germany.
Eur Radiol. 2012 Dec;22(12):2750-8. doi: 10.1007/s00330-012-2608-x. Epub 2012 Aug 4.
To create an advanced image retrieval and data-mining system based on in-house radiology reports.
Radiology reports are semantically analysed using natural language processing (NLP) techniques and stored in a state-of-the-art search engine. Images referenced by sequence and image number in the reports are retrieved from the picture archiving and communication system (PACS) and stored for later viewing. A web-based front end is used as an interface to query for images and show the results with the retrieved images and report text. Using a comprehensive radiological lexicon for the underlying terminology, the search algorithm also finds results for synonyms, abbreviations and related topics.
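Two of the steps described above, finding image references in the report text and expanding a query with lexicon terms, can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the reference pattern, the toy lexicon entries and the function names `extract_image_refs` and `expand_query` are all assumptions made for the example.

```python
import re

# Hypothetical pattern for references such as "series 4, image 23".
# The wording used in real reports is not given in the abstract.
IMAGE_REF = re.compile(
    r"\b(?:series|sequence)\s+(\d+)\s*,?\s*image\s+(\d+)", re.IGNORECASE
)

# Toy stand-in for a radiological lexicon: preferred term -> synonyms/abbreviations.
LEXICON = {
    "pulmonary embolism": {"pe", "lung embolism"},
}


def extract_image_refs(report_text):
    """Return (series number, image number) pairs mentioned in a report."""
    return [(int(s), int(i)) for s, i in IMAGE_REF.findall(report_text)]


def expand_query(term):
    """Return the term together with all lexicon variants that share its entry."""
    term = term.lower().strip()
    for preferred, variants in LEXICON.items():
        if term == preferred or term in variants:
            return {preferred} | variants
    return {term}  # unknown terms are searched as typed
```

Each extracted pair could then serve as a key for requesting the corresponding image from the PACS, and the expanded term set could be passed to the search engine as alternatives for the same query.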
The test set consisted of 108 manually annotated reports, which were analysed with different system configurations. The best results were achieved using full syntactic and semantic analysis, with a precision of 0.929 and a recall of 0.952. The system has operated successfully since October 2010; 258,824 reports have been indexed and a total of 405,146 preview images are stored in the database.
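For readers less familiar with these metrics, the sketch below shows how precision and recall are computed from counts of true positives, false positives and false negatives, and how the two reported values combine into an F1 score. The F1 value is derived here from the reported precision and recall; it is not a figure stated in the abstract, and the function names are chosen only for this example.

```python
def precision(tp, fp):
    """Fraction of retrieved items that are relevant."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of relevant items that were retrieved."""
    return tp / (tp + fn)


def f1_from_pr(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Values reported for the best configuration (full syntactic and semantic analysis).
reported_precision = 0.929
reported_recall = 0.952
print(round(f1_from_pr(reported_precision, reported_recall), 3))
```

With the reported values, the harmonic mean comes to roughly 0.94.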
Data-mining and NLP techniques provide quick access to a vast repository of images and radiology reports with both high precision and recall values. Consequently, the system has become a valuable tool in daily clinical routine, education and research.
Radiology reports can now be analysed using sophisticated natural language-processing techniques. Semantic text analysis is backed by terminology of a radiological lexicon. The search engine includes results for synonyms, abbreviations and compositions. Key images are automatically extracted from radiology reports and fetched from PACS. Such systems help to find diagnoses, improve report quality and save time.