IEEE J Biomed Health Inform. 2017 Nov;21(6):1625-1632. doi: 10.1109/JBHI.2017.2691738. Epub 2017 Apr 6.
Epithelium-stroma classification is a necessary preprocessing step in histopathological image analysis. Current deep learning-based recognition methods for histology data require large volumes of labeled data to train a new neural network whenever the image acquisition procedure changes. However, it is extremely expensive for pathologists to manually label sufficient data to a professional standard for each pathology study, which limits real-world applications. This paper proposes a simple but effective deep learning method that introduces unsupervised domain adaptation into a simple convolutional neural network (CNN). Inspired by transfer learning, we assume that the training data and testing data follow different distributions, and we apply an adaptation operation during feature extraction to estimate the CNN kernels more accurately, enhancing performance by transferring knowledge from labeled data in the source domain to unlabeled data in the target domain. The model was evaluated on three independent public epithelium-stroma datasets using cross-dataset validation. The experimental results demonstrate that, for epithelium-stroma classification, the proposed framework outperforms the state-of-the-art deep neural network model and also achieves better performance than other existing deep domain adaptation methods. The proposed model can be considered a better option for real-world applications in histopathological image analysis, since it no longer requires large-scale labeled data in each specified domain.