Department of Electronics and Communication Engineering, Ambedkar Institute of Advanced Communication Technologies and Research, Delhi, India.
Phys Eng Sci Med. 2021 Sep;44(3):655-665. doi: 10.1007/s13246-021-01013-2. Epub 2021 May 20.
Recognition of tissues and organs is a recurrent step performed by experts during analyses of histological images. With advances in machine learning, such steps can be automated using computer vision methods. This paper presents an ensemble-based approach for improved classification of non-pathological tissues and organs in histological images using convolutional neural networks (CNNs). Because the dataset is small, we relied on transfer learning, in which pre-trained CNNs are reused for new classification problems. Transfer learning was performed with eleven CNN architectures on 6000 image patches that constitute the training and validation subsets of a public dataset containing six cardiovascular categories. The CNN models were fine-tuned on a much larger dataset, obtained by augmenting the training subset, to reach acceptable performance on the validation subset. Lastly, we created various ensembles of the trained classifiers and evaluated them on the testing subset of 7500 patches. The best ensemble classifier gives precision, recall, and accuracy of 0.876, 0.869, and 0.869, respectively, on the test images. With an overall F-score of 0.870, our ensemble-based approach outperforms previous approaches based on a single fine-tuned CNN, a CNN trained from scratch, and traditional machine learning by 0.019, 0.064, and 0.183, respectively. An ensemble approach can perform better than approaches based on an individual classifier, provided the constituent classifiers are chosen wisely. The empirical choice of classifiers reinforces the intuition that newer models that perform better in their native domain are more likely to perform better in the transferred domain, since the best ensemble consists predominantly of more recently proposed, better-performing architectures.
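As a rough illustration of the ensemble idea described above, the following minimal Python sketch averages the per-class probabilities of several classifiers (soft voting) and then computes macro-averaged precision, recall, F-score, and accuracy. The abstract does not state how the ensembles combine their members or how the metrics are averaged, so both the averaging rule and the macro averaging are assumptions here. The function names, the three toy "models", and the three-class toy labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence

Probabilities = Sequence[Sequence[float]]  # probs[i][c]: sample i, class c


def soft_vote(prob_sets: Sequence[Probabilities]) -> List[int]:
    """Average class probabilities across models, then take the argmax.

    Averaging is an assumed combination rule, not one stated in the abstract.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    predictions = []
    for i in range(n_samples):
        mean = [
            sum(prob_sets[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=mean.__getitem__))
    return predictions


def macro_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> Dict[str, float]:
    """Macro-averaged precision, recall, F-score, plus overall accuracy."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    precisions, recalls, f_scores = [], [], []
    for c in range(n_classes):
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        precisions.append(p)
        recalls.append(r)
        f_scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return {
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f_score": sum(f_scores) / n_classes,
        "accuracy": sum(tp.values()) / len(y_true),
    }


# Hypothetical toy data: 4 samples, 3 classes, 3 stand-in "models".
y_true = [0, 1, 2, 1]
model_a = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
model_b = [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7], [0.3, 0.5, 0.2]]
model_c = [[0.5, 0.4, 0.1], [0.5, 0.3, 0.2], [0.3, 0.3, 0.4], [0.2, 0.6, 0.2]]

single_pred = soft_vote([model_a])  # one model on its own
ensemble_pred = soft_vote([model_a, model_b, model_c])

single_scores = macro_metrics(y_true, single_pred, n_classes=3)
ensemble_scores = macro_metrics(y_true, ensemble_pred, n_classes=3)
print("single model:", single_scores)
print("ensemble:    ", ensemble_scores)
```

In this toy setup the first and third models each misclassify one sample, but they err on different samples, so the averaged probabilities recover the correct label in both cases. That is the general reason the choice of constituent classifiers matters: members whose errors overlap add little to an ensemble.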