Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Chennai, Tamil Nadu, India.
Department of Computer Science & Engineering, Anna University, Chennai-600025, Tamil Nadu, India.
J Digit Imaging. 2021 Jun;34(3):618-629. doi: 10.1007/s10278-021-00456-z. Epub 2021 May 10.
Computer-aided detection (CADe) and computer-aided diagnosis (CADx) systems are ongoing research areas for identifying lesions among complex inner structures with different pixel intensities, and for medical image classification. Several techniques are available for breast cancer detection and diagnosis using CADe and CADx systems. However, some of these systems are not accurate enough or suffer from a lack of sufficient data. For example, mammography is the most commonly used breast cancer detection technique, and several CADe and CADx systems are based on it because large mammography datasets are publicly available. However, the number of cancers that escape detection with mammography is substantial, particularly in women with dense breasts. Digital breast tomosynthesis (DBT), on the other hand, is a new imaging technique that alleviates the limitations of mammography. Collecting large numbers of DBT images is difficult, however, because such datasets are not publicly available. In such cases, transfer learning can be employed: the knowledge learned from a trained source domain task, whose dataset is readily available, is transferred to improve learning in the target domain task, whose dataset may be scarce. In this paper, a two-level framework is developed for the classification of DBT datasets. First, a basic multilevel transfer learning (MLTL) based framework is proposed, which uses the knowledge learned from general non-medical image datasets and the mammography dataset to train and classify the target DBT dataset. Second, a feature extraction based transfer learning (FETL) framework is proposed to further improve the classification performance of the MLTL based framework. The FETL framework examines three different feature extraction techniques to augment the performance of the MLTL based framework.
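The multilevel idea, in which weights learned on a large source domain are carried into an intermediate domain and then into a small target domain, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not specify the model or the data, so a plain logistic regression stands in for the network, and `make_toy_domain`, `fine_tune`, and the synthetic "domains" are hypothetical names and data, not the authors' method.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(weights, data):
    # Mean binary cross-entropy of a linear model on (features, label) pairs.
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def fine_tune(weights, data, lr=0.1, epochs=200):
    # Continue full-batch gradient descent starting from the given weights,
    # which is how knowledge from the previous stage is reused.
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def make_toy_domain(n, shift, seed):
    # Hypothetical synthetic domain: one informative feature whose
    # distribution is shifted per domain, one noise feature, and a bias term.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [1.0, rng.gauss(2.0 * y - 1.0 + shift, 1.0), rng.gauss(0.0, 1.0)]
        data.append((x, y))
    return data


# Stand-ins for the three domains: plentiful source data, fewer
# intermediate samples, and a scarce target set.
source = make_toy_domain(500, 0.0, seed=0)
intermediate = make_toy_domain(100, 0.3, seed=1)
target = make_toy_domain(20, 0.5, seed=2)

w_init = [0.0, 0.0, 0.0]
w_source = fine_tune(w_init, source)                  # level 1: source domain
w_staged = fine_tune(w_source, intermediate)          # level 2: intermediate domain
w_transfer = fine_tune(w_staged, target, epochs=20)   # final: brief target fine-tuning

print("target loss before target fine-tuning:", log_loss(w_staged, target))
print("target loss after target fine-tuning: ", log_loss(w_transfer, target))
```

The only design point the sketch is meant to convey is that each stage starts from the previous stage's weights rather than from scratch, so the scarce target data is used only for a short final adjustment.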
An area under the receiver operating characteristic (ROC) curve of 0.89 is obtained using just 2.08% of the source domain (non-medical) dataset, 5.09% of the intermediate domain (mammography) dataset, and 3.94% of the target domain (DBT) dataset, relative to the dataset sizes reported in the literature.
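For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as half. The following minimal sketch computes it by direct pairwise comparison; the function name `roc_auc` and the labels and scores are made up for illustration and are not the paper's data.

```python
def roc_auc(labels, scores):
    # AUC as the fraction of (positive, negative) pairs ranked correctly,
    # counting tied scores as half a correct pair.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = malignant, 0 = benign) and classifier scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
print(roc_auc(labels, scores))  # 3 of 4 pairs are ordered correctly -> 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a rank-based computation would be preferable for large test sets.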