Barbosa Flavio, Traina Agma Jucci, Muglia Valdair Francisco
Valdair Muglia, MD, PhD, Universidade de Sao Paulo Ribeirao Preto School of Medicine, Internal Medicine, Av Bandeirantes 3900, Campus Monte Alegre, Ribeirao Preto, Sao Paulo 14049900, Brazil
Appl Clin Inform. 2016 Aug 24;7(3):803-16. doi: 10.4338/ACI-2016-03-RA-0037.
Structured reports for imaging exams aim to increase the precision of information retrieval and of communication between physicians. However, a structured report is more concise than free text and may limit specialists' descriptions of important findings not covered by its pre-defined structures. A computational ontological structure derived from free-text reports written by specialists may solve this problem. Therefore, the goal of our study was to develop a methodology for structuring information in radiology reports that covers the specifications required for the Brazilian Portuguese language, including the terminology to be used.
We gathered 1,701 radiology reports of magnetic resonance imaging (MRI) studies of the lumbosacral spine from three different institutions. Text mining techniques and ontological conceptualization of the extracted lexical units were used to structure the information. Ten radiologists specializing in lumbosacral MRI used an electronic questionnaire to evaluate the textual superstructure and the extracted terminology.
The established methodology consists of six steps: 1) collection of radiology reports of a specific MRI examination; 2) textual decomposition; 3) normalization of lexical units; 4) identification of textual superstructures; 5) conceptualization of candidate-terms; and 6) evaluation of superstructures and extracted terminology by experts using an electronic questionnaire. Three different textual superstructures were identified, with terminological variations in the names of their textual categories. The number of candidate-terms conceptualized was 4,183, yielding 727 concepts. There were a total of 13,963 relationships between candidate-terms and concepts and 789 relationships among concepts.
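Steps 2 to 5 above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the tools used, so the category headings, the sample report, the term-to-concept map, and the concept identifiers below are all hypothetical, and plain substring matching stands in for whatever text mining technique the study applied.

```python
import re
import unicodedata
from typing import Dict, List, Tuple

# Hypothetical category headings (assumption); the abstract does not list the
# category names found in the three textual superstructures.
CATEGORY_HEADINGS = ("tecnica", "analise", "impressao")


def normalize(unit: str) -> str:
    """Step 3: lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", unit.lower())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"\s+", " ", no_accents).strip()


def decompose(report: str) -> Dict[str, List[str]]:
    """Steps 2 and 4: split a report into categories, then into sentences."""
    sections: Dict[str, List[str]] = {}
    current = "sem categoria"  # bucket for text before any known heading
    for line in report.splitlines():
        head, sep, rest = line.partition(":")
        if sep and normalize(head) in CATEGORY_HEADINGS:
            current = normalize(head)
            line = rest
        for sentence in re.split(r"[.;]", line):
            if sentence.strip():
                sections.setdefault(current, []).append(normalize(sentence))
    return sections


def conceptualize(
    sections: Dict[str, List[str]], term_to_concept: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Step 5: link candidate terms to concepts, keeping the source category."""
    links = []
    for category, sentences in sections.items():
        for sentence in sentences:
            for term, concept in term_to_concept.items():
                if term in sentence:
                    links.append((category, term, concept))
    return links


# Hypothetical report text and term map, for illustration only.
report = (
    "Técnica: Sequências ponderadas em T1 e T2.\n"
    "Análise: Hérnia discal em L4-L5; protrusão discal em L5-S1.\n"
    "Impressão: Hérnia de disco."
)
term_to_concept = {
    "hernia discal": "HERNIA_DE_DISCO",
    "hernia de disco": "HERNIA_DE_DISCO",
    "protrusao discal": "PROTRUSAO_DISCAL",
}

sections = decompose(report)
links = conceptualize(sections, term_to_concept)
for category, term, concept in links:
    print(category, term, concept)
```

In this sketch two different candidate-terms map to the same concept, which mirrors why the number of candidate-terms in the study exceeds the number of concepts. Each link also keeps its category, so a concept can be traced back to the part of the report it came from.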
The proposed methodology allowed information to be structured in a more intuitive and practical way. The identification of three textual superstructures, the extraction of lexical units, and their normalization and ontological conceptualization were achieved while maintaining references to their respective categories and to the free-text radiology reports.