Lafarge Maxime W, Pluim Josien P W, Eppenhof Koen A J, Veta Mitko
Medical Image Analysis Group, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands.
Front Med (Lausanne). 2019 Jul 16;6:162. doi: 10.3389/fmed.2019.00162. eCollection 2019.
Histological images present high appearance variability due to inconsistent latent parameters related to the preparation and scanning procedure of histological slides, as well as the inherent biological variability of tissues. Machine-learning models are trained with images from a limited set of domains, and are expected to generalize to images from unseen domains. Methodological design choices have to be made in order to yield domain invariance and proper generalization. In digital pathology, standard approaches either normalize the latent parameters based on prior knowledge, as in staining normalization, or aim to anticipate new variations of these parameters via data augmentation. Since every histological image originates from a unique data distribution, we propose to consider every histological slide of the training data as a domain, and we investigate the alternative approach of domain-adversarial training to learn features that are invariant to this available domain information. We carried out a comparative analysis with staining normalization and data augmentation on two different tasks: generalization to images acquired in unseen pathology labs for mitosis detection, and generalization to unseen organs for nuclei segmentation. We report that the utility of each method depends on the type of task and on the type of data variability present at training and test time. The proposed framework for domain-adversarial training is able to improve generalization performance on top of conventional methods.
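To make the idea of domain-adversarial training concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a toy linear feature extractor, a binary task label, and a binary domain label (the abstract treats every training slide as its own domain, which would call for a multi-class domain classifier). The function name `dann_step`, the parameter names, and the weighting factor `lam` are hypothetical. The key point is the sign flip in the feature-extractor gradient: the domain classifier is trained to predict the domain, while the features are pushed in the opposite direction so that the domain becomes harder to predict.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dann_step(params, x, y, k, lr=0.1, lam=1.0):
    """One SGD step on one sample; returns an updated copy of params.

    params: {"w": feature weights, "a", "b": task head, "c", "d": domain head}
    x: input vector, y: task label (0/1), k: domain label (0/1, assumed binary)
    lam: hypothetical weight of the adversarial (reversed) domain gradient
    """
    w, a, b, c, d = (params[name] for name in ("w", "a", "b", "c", "d"))

    # Shared scalar feature computed by the linear feature extractor.
    h = sum(wi * xi for wi, xi in zip(w, x))

    # For a sigmoid output with cross-entropy loss, dLoss/dlogit = p - label.
    err_task = sigmoid(a * h + b) - y
    err_dom = sigmoid(c * h + d) - k

    # Gradient reaching the feature: the task term is kept as-is, the
    # domain term is reversed (multiplied by -lam).
    grad_h = err_task * a - lam * err_dom * c

    return {
        "w": [wi - lr * grad_h * xi for wi, xi in zip(w, x)],
        "a": a - lr * err_task * h,   # task head minimizes the task loss
        "b": b - lr * err_task,
        "c": c - lr * err_dom * h,    # domain head minimizes the domain loss
        "d": d - lr * err_dom,
    }
```

In this sketch, the two heads follow ordinary gradient descent on their own losses, and only the shared feature weights `w` see the reversed domain gradient. In a deep-learning framework the same effect is usually obtained with a gradient-reversal operation placed between the feature extractor and the domain classifier.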