Department of Mechatronics and Biomedical Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Kampar, Malaysia.
Department of Electrical and Electronic Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Kampar, Malaysia.
Sci Rep. 2023 Nov 22;13(1):20518. doi: 10.1038/s41598-023-46619-6.
Debate persists over the impact of Stain Normalization (SN) in recent breast cancer histopathological studies. Some studies report no influence on classification outcomes, while others report an improvement. This study assesses the efficacy of SN in breast cancer histopathological classification, focusing on Invasive Ductal Carcinoma (IDC) grading with Convolutional Neural Networks (CNNs). The null hypothesis states that SN has no effect on the accuracy of CNN-based IDC grading; the alternative hypothesis states that it does. We evaluated six SN techniques, with five templates selected as target images for the conventional SN techniques, and used seven ImageNet pre-trained CNNs for IDC grading. The performance of models trained with and without SN was compared to determine the influence of SN on classification outcomes. The analysis yielded a p-value of 0.11, indicating no statistically significant difference in Balanced Accuracy Score between models trained with StainGAN-normalized images (the best-performing SN technique), which scored 0.9196, and models trained with non-normalized images, which scored 0.9308. We therefore did not reject the null hypothesis: we found no evidence of a significant difference in effectiveness between stain-normalized and non-normalized datasets for IDC grading tasks. These results indicate that SN has a limited impact on IDC grading, challenging the assumption that SN enhances performance.
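The comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which statistical test produced the p-value of 0.11, so the exact paired sign-flip permutation test below is only an assumed stand-in, not the authors' method. The function names, the grade labels, and the seven per-model scores are all hypothetical and were chosen for illustration only; they are not data from the study.

```python
from itertools import product


def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: the mean of per-class recall over classes in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def paired_sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided paired permutation test on the summed score difference.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so every sign assignment is enumerated (2**n cases for n pairs).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            count += 1
    return count / total


# Hypothetical IDC grade labels (grades 1-3) for a single model.
y_true = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
ba = balanced_accuracy(y_true, y_pred)

# Hypothetical balanced accuracy scores for seven CNNs, one pair per model:
# trained on non-normalized images vs. trained on stain-normalized images.
scores_raw = [0.93, 0.92, 0.91, 0.94, 0.90, 0.93, 0.92]
scores_sn = [0.92, 0.93, 0.90, 0.92, 0.91, 0.92, 0.91]
p_value = paired_sign_flip_p_value(scores_raw, scores_sn)

print(f"balanced accuracy: {ba:.4f}")
print(f"p-value: {p_value:.4f}")
```

A p-value above the chosen significance level (commonly 0.05) means the null hypothesis of no difference is not rejected, which is the kind of outcome the abstract reports. Pairing the scores by CNN architecture is a design choice of this sketch: it compares each network with itself under the two preprocessing conditions, so differences between architectures do not mask the effect of SN.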