Center for Evidence-Based Imaging, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA.
J Digit Imaging. 2012 Aug;25(4):512-9. doi: 10.1007/s10278-012-9463-9.
Radiology reports are permanent legal documents that serve as the official interpretation of imaging tests. Manual analysis of the textual information contained in these reports requires significant time and effort. This study describes the development and initial evaluation of a toolkit that enables automated identification of relevant information within these largely unstructured text reports. We developed and made publicly available a natural language processing toolkit, Information from Searching Content with an Ontology-Utilizing Toolkit (iSCOUT). Its core functions are provided by the following modules: the Data Loader, Header Extractor, Terminology Interface, Reviewer, and Analyzer. The toolkit enables searching for specific terms and retrieving radiology reports whose text contains exact term matches as well as similar or synonymous term matches. The Terminology Interface is the main component of the toolkit. It allows query expansion based on synonyms from a controlled terminology (e.g., RadLex or the National Cancer Institute Thesaurus [NCIT]). We evaluated iSCOUT document retrieval of radiology reports that documented liver cysts, and compared precision and recall with and without using NCIT synonyms for query expansion. Utilizing NCIT, iSCOUT retrieved radiology reports with documented liver cysts with a precision of 0.92 and a recall of 0.96. This recall (i.e., utilizing the Terminology Interface) was significantly better than the recall obtained using either of the two search terms alone (0.72, p=0.03 for "liver cyst" and 0.52, p=0.0002 for "hepatic cyst"). iSCOUT reliably assembled relevant radiology reports for a cohort of patients with liver cysts, with significant improvement in document retrieval when utilizing controlled lexicons.
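To make the idea of synonym-based query expansion and its effect on precision and recall concrete, the following is a minimal Python sketch. It is not iSCOUT's implementation: the synonym table, the function names (`expand_query`, `retrieve`, `precision_recall`), the toy reports, and the relevance labels are all hypothetical, and a simple whole-phrase regular-expression match stands in for whatever matching the toolkit actually performs.

```python
import re

# Hypothetical synonym table standing in for a controlled-terminology lookup
# (e.g., NCIT). The entries are illustrative and were not taken from NCIT.
SYNONYMS = {
    "liver cyst": ["liver cyst", "hepatic cyst"],
}

# Toy reports and relevance labels, invented for this sketch.
REPORTS = {
    "r1": "There is a simple liver cyst in the right lobe.",
    "r2": "Multiple hepatic cysts are noted.",
    "r3": "No liver cyst is seen.",  # term present, finding absent: a toy false positive
    "r4": "Unremarkable chest radiograph.",
}
RELEVANT = {"r1", "r2"}  # reports that actually document a liver cyst


def expand_query(term, synonyms=SYNONYMS):
    """Return the search term followed by any synonyms listed for it."""
    term = term.lower()
    expanded = [term]
    for synonym in synonyms.get(term, []):
        if synonym not in expanded:
            expanded.append(synonym)
    return expanded


def retrieve(reports, terms):
    """Return ids of reports containing any term (case-insensitive, optional plural 's')."""
    patterns = [
        re.compile(r"\b" + re.escape(t) + r"s?\b", re.IGNORECASE) for t in terms
    ]
    return {
        rid for rid, text in reports.items() if any(p.search(text) for p in patterns)
    }


def precision_recall(retrieved, relevant):
    """Compute precision and recall for a set of retrieved ids."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    for label, terms in [
        ("single term", ["liver cyst"]),
        ("expanded query", expand_query("liver cyst")),
    ]:
        p, r = precision_recall(retrieve(REPORTS, terms), RELEVANT)
        print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

On this toy data the single term misses the report that says "hepatic cysts", while the expanded query finds it, so recall rises. The report stating that no cyst is seen is retrieved either way, which illustrates why precision can stay below 1 with plain term matching.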