Bouteldja Nassim, Hölscher David L, Bülow Roman D, Roberts Ian S D, Coppo Rosanna, Boor Peter
Institute of Pathology, RWTH Aachen University Hospital, Aachen, Germany.
Department of Cellular Pathology, Oxford University Hospitals National Health Service Foundation Trust, Oxford, United Kingdom.
J Pathol Inform. 2022 Sep 13;13:100140. doi: 10.1016/j.jpi.2022.100140. eCollection 2022.
Considerable inter- and intra-laboratory stain variability exists in pathology and poses a challenge for the development and application of deep learning (DL) approaches. Since addressing all sources of stain variability with manual annotation is not feasible, we investigated and compared unsupervised DL approaches to reduce the consequences of stain variability in kidney pathology.
We aimed to improve the applicability of a pretrained DL segmentation model to 3 external multi-centric cohorts with large stain variability. In contrast to the traditional approach of training generative adversarial networks (GANs) for stain normalization, we propose to tackle stain variability by data augmentation. We use CycleGANs to augment the training data of the pretrained model with the stain variability and then retrain the model on the stain-augmented dataset. We compared the performance of (i) the unmodified pretrained segmentation model, (ii) CycleGAN-based stain normalization, (iii) a feature-preserving modification of (ii) for improved normalization, and (iv) the proposed stain-augmented model.
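The augmentation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `make_toy_translator` and `stain_augment` are hypothetical, and the trained CycleGAN generators are replaced by simple per-channel colour shifts so that the example runs without any trained model. It also assumes that stain translation leaves tissue geometry unchanged, so each translated image can reuse the original segmentation mask.

```python
import numpy as np

def make_toy_translator(gain, bias):
    """Hypothetical stand-in for a trained CycleGAN generator.

    A real generator would be a neural network mapping an image into the
    stain appearance of another cohort; here it is only a colour shift.
    """
    gain = np.asarray(gain, dtype=np.float32)
    bias = np.asarray(bias, dtype=np.float32)

    def translate(img):
        # img: float32 array of shape (H, W, 3) with values in [0, 1]
        return np.clip(img * gain + bias, 0.0, 1.0).astype(np.float32)

    return translate

def stain_augment(dataset, translators):
    """Return the original (image, mask) pairs plus one translated copy
    of every image per translator, each paired with the original mask."""
    augmented = list(dataset)
    for img, mask in dataset:
        for translate in translators.values():
            # Assumption: the translation changes colour only, so the
            # annotation mask stays valid for the translated image.
            augmented.append((translate(img), mask))
    return augmented

# Toy training set: 2 random RGB tiles with integer label masks.
rng = np.random.default_rng(0)
dataset = [
    (rng.random((8, 8, 3), dtype=np.float32), rng.integers(0, 3, size=(8, 8)))
    for _ in range(2)
]

# One stand-in translator per external cohort (cohort names are made up).
translators = {
    "cohort_a": make_toy_translator([1.10, 0.90, 1.00], [0.00, 0.05, 0.00]),
    "cohort_b": make_toy_translator([0.90, 1.00, 1.10], [0.05, 0.00, 0.00]),
    "cohort_c": make_toy_translator([1.00, 1.10, 0.90], [0.00, 0.00, 0.05]),
}

augmented = stain_augment(dataset, translators)
print(len(dataset), "->", len(augmented), "training samples")
```

The segmentation model would then be retrained on `augmented` instead of `dataset`. By contrast, stain normalization applies a translator to the external images at inference time and leaves the pretrained model unchanged.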
The proposed stain-augmented model showed the highest mean segmentation accuracy in all external cohorts and maintained comparable performance on the training cohort. However, the increase in performance over the pretrained model was only marginal. CycleGAN-based stain normalization encoded imperceptible information into the normalized images, which confused the pretrained model and resulted in slightly worse performance.
Our findings suggest that stain variability can be tackled more effectively by augmenting the training data with it than by following the commonly used approach of normalizing the stain. However, because this approach provides only a slight performance increase, its applicability has to be weighed against the additional carbon footprint.