Kumar Ashnil, Kim Jinman, Lyndon David, Fulham Michael, Feng Dagan
IEEE J Biomed Health Inform. 2017 Jan;21(1):31-40. doi: 10.1109/JBHI.2016.2635663. Epub 2016 Dec 5.
The availability of medical imaging data from clinical archives, research literature, and clinical manuals, coupled with recent advances in computer vision, offers the opportunity for image-based diagnosis, teaching, and biomedical research. However, the content and semantics of an image can vary depending on its modality, and as such the identification of image modality is an important preliminary step. The key challenge in automatically classifying the modality of a medical image stems from the visual characteristics of different modalities: some are visually distinct while others may have only subtle differences. This challenge is compounded by variations in the appearance of images based on the diseases depicted and by a lack of sufficient training data for some modalities. In this paper, we introduce a new method for classifying medical images that uses an ensemble of different convolutional neural network (CNN) architectures. CNNs are a state-of-the-art image classification technique that learns the optimal image features for a given classification task. We hypothesise that different CNN architectures learn different levels of semantic image representation and thus that an ensemble of CNNs will enable higher quality features to be extracted. Our method develops a new feature extractor by fine-tuning CNNs that have been initialized on a large dataset of natural images. The fine-tuning process leverages the generic image features from natural images that are fundamental for all images and optimizes them for the variety of medical imaging modalities. These features are used to train numerous multiclass classifiers whose posterior probabilities are fused to predict the modalities of unseen images. Our experiments on the ImageCLEF 2016 medical image public dataset (30 modalities; 6776 training images and 4166 test images) show that our ensemble of fine-tuned CNNs achieves a higher accuracy than established CNNs.
Our ensemble also achieves a higher accuracy than methods in the literature evaluated on the same benchmark dataset, and is surpassed only by methods that source additional training data.
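The fusion step described above (combining the posterior probabilities of several multiclass classifiers to predict a modality) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the combination rule, so simple averaging of posteriors is assumed here, and the function name `fuse_posteriors` is hypothetical.

```python
import numpy as np

def fuse_posteriors(posteriors):
    """Fuse per-classifier posterior probabilities by averaging.

    posteriors: list of (n_samples, n_classes) arrays, one per
    classifier, each row a posterior distribution over modalities.
    Returns the predicted class index for each sample.

    Note: averaging is an assumed fusion rule; the paper's abstract
    only states that posteriors are fused, not how.
    """
    # Stack to shape (n_classifiers, n_samples, n_classes),
    # average over the classifier axis, then take the argmax class.
    fused = np.mean(np.stack(posteriors, axis=0), axis=0)
    return fused.argmax(axis=1)

# Toy usage: two classifiers, two test images, three modalities.
p1 = np.array([[0.6, 0.3, 0.1],
               [0.2, 0.5, 0.3]])
p2 = np.array([[0.4, 0.4, 0.2],
               [0.1, 0.2, 0.7]])
predictions = fuse_posteriors([p1, p2])
```

Averaging (soft voting) lets a classifier that is confidently correct outweigh others that are weakly wrong, which is one common motivation for fusing posteriors rather than taking a majority vote over hard labels.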