IEEE J Biomed Health Inform. 2016 Jan;20(1):281-92. doi: 10.1109/JBHI.2014.2375491. Epub 2014 Nov 25.
Content-based image retrieval (CBIR) is a search technology that could aid medical diagnosis by retrieving and presenting previously reported cases that are related to the one being diagnosed. To retrieve relevant cases, CBIR systems depend on supervised learning to map low-level image contents to high-level diagnostic concepts. However, annotation by medical doctors for training and evaluation purposes is a difficult and time-consuming task, which restricts the supervised learning phase to specific CBIR problems in well-defined clinical applications. This paper proposes a new technique that automatically learns the similarity between exams from textual distances extracted from radiology reports, thereby reducing the number of annotations needed. Our method first infers the relation between patients by using information retrieval techniques to determine the textual distances between patient radiology reports. These distances are subsequently used to supervise a metric learning algorithm that transforms the image space according to the textual distances. CBIR systems with different image descriptions and different levels of medical annotation were evaluated, with and without supervision from textual distances, using a database of computed tomography scans of patients with interstitial lung diseases. The proposed method consistently improves CBIR mean average precision, with improvements of up to 38% and more marked gains for small annotation sets. Given the overall availability of radiology reports in picture archiving and communication systems, the proposed approach can be broadly applied to CBIR systems in different medical problems, and may facilitate the introduction of CBIR in clinical practice.
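The two-stage idea described above (textual distances between reports, then a metric on image descriptors supervised by those distances) can be illustrated with a minimal sketch. The abstract does not name the specific information retrieval weighting or metric learning algorithm, so everything concrete here is an assumption made for illustration: TF-IDF with cosine distance stands in for the textual distance, a per-dimension (diagonal) weighting fitted by gradient descent stands in for the metric learner, and the reports, the 2-D image descriptors, and all function names are hypothetical toy values rather than data or code from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(reports):
    """Turn each report (a string) into a sparse TF-IDF dictionary."""
    docs = [Counter(r.lower().split()) for r in reports]
    n = len(docs)
    df = Counter(term for d in docs for term in d)  # document frequency
    # Smoothed IDF (an assumed weighting, not taken from the paper).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in d.items()} for d in docs]


def cosine_distance(u, v):
    """1 - cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return 1.0 - dot / (norm_u * norm_v) if norm_u and norm_v else 1.0


def learn_diagonal_metric(features, target, steps=500, lr=0.5):
    """Fit non-negative weights w so that sum_k w[k] * (x_i[k] - x_j[k])**2
    approximates target[(i, j)] in the least-squares sense.
    A simple stand-in for a metric learning algorithm."""
    dim = len(features[0])
    w = [1.0] * dim
    pairs = list(target)
    for _ in range(steps):
        grad = [0.0] * dim
        for i, j in pairs:
            sq = [(a - b) ** 2 for a, b in zip(features[i], features[j])]
            err = sum(wk * s for wk, s in zip(w, sq)) - target[(i, j)]
            for k in range(dim):
                grad[k] += 2.0 * err * sq[k] / len(pairs)
        # Projected gradient step keeps the weights non-negative.
        w = [max(0.0, wk - lr * g) for wk, g in zip(w, grad)]
    return w


# Hypothetical toy radiology reports (two similar pairs).
reports = [
    "ground glass opacities in both lower lobes",
    "ground glass opacities in the lower lobes",
    "honeycombing and reticular pattern with fibrosis",
    "reticular pattern with honeycombing and fibrosis",
]
# Hypothetical 2-D image descriptors: dimension 0 follows the findings,
# dimension 1 is unrelated noise.
image_features = [[0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2]]

vectors = tfidf_vectors(reports)
text_dist = {
    (i, j): cosine_distance(vectors[i], vectors[j])
    for i, j in combinations(range(len(reports)), 2)
}
weights = learn_diagonal_metric(image_features, text_dist)
print("textual distances:", text_dist)
print("learned per-dimension weights:", weights)
```

In this toy setup the learned weighting should favour the descriptor dimension that agrees with the report text, which is the intuition behind using report-derived distances as supervision when expert annotations are scarce.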