Leveraging an ensemble of EfficientNetV1 and EfficientNetV2 models for classification and interpretation of breast cancer histopathology images.

Author information

Azmoodeh-Kalati Mahdi, Shabani Hasti, Maghareh Mohammad Sadegh, Barzegar Zeynab, Lashgari Reza

Affiliations

Institute of Medical Science and Technology, Shahid Beheshti University, Tehran, Iran.

Computer Science Department, Amirkabir University of Technology, Tehran, Iran.

Publication information

Sci Rep. 2025 Jul 1;15(1):21541. doi: 10.1038/s41598-025-06853-6.

DOI:10.1038/s41598-025-06853-6

PMID:40596576

Abstract

As of 2024, breast cancer is the second leading cause of cancer-related death among women, after lung cancer. Conventional diagnosis relies on pathologists manually examining biopsied tissue, a time-consuming process whose results can vary with individual expertise. Early detection and accurate diagnosis are crucial for effective treatment planning and patient care. Whole-slide scanners have changed this process by making it possible to apply Computer-Aided Detection (CAD) systems for automated analysis. In this study, we use state-of-the-art Convolutional Neural Networks (CNNs), specifically EfficientNetV1 and EfficientNetV2, for binary classification on the BreakHis dataset, a collection of breast tissue histopathology images labeled as benign or malignant. To address the limited amount of annotated data, we apply data augmentation and transfer learning. We improve model interpretability with Grad-CAM, which generates localization maps highlighting the image regions most relevant to each prediction. We also employ ensemble learning to improve classification performance, combining the predictions of multiple trained models through unweighted averaging and majority voting. In addition, we propose two ensemble architectures that integrate different trained EfficientNet models. Our framework achieves a classification accuracy of 99.58%, outperforming conventional CNN models on the BreakHis dataset. This study highlights the potential of ensemble learning to enhance diagnostic accuracy in breast cancer detection.
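The two combination rules named in the abstract, unweighted averaging and majority voting, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 0.5 decision threshold, the strict-majority tie rule, and the example probabilities are all assumptions made for illustration, and each model is assumed to output a probability that an image is malignant.

```python
import numpy as np


def unweighted_average(probs):
    """Average P(malignant) across models, then threshold at 0.5 (assumed).

    probs has shape (n_models, n_samples).
    """
    probs = np.asarray(probs, dtype=float)
    mean_prob = probs.mean(axis=0)  # every model gets equal weight
    return (mean_prob >= 0.5).astype(int)


def majority_vote(probs, threshold=0.5):
    """Threshold each model separately, then take the strict majority label."""
    probs = np.asarray(probs, dtype=float)
    votes = (probs >= threshold).astype(int)  # 1 = malignant, 0 = benign
    # A sample is labeled malignant only if more than half of the models agree.
    return (votes.sum(axis=0) * 2 > votes.shape[0]).astype(int)


# Hypothetical outputs of three trained models on four images.
model_probs = np.array([
    [0.90, 0.40, 0.20, 0.95],
    [0.80, 0.45, 0.60, 0.55],
    [0.70, 0.90, 0.10, 0.10],
])

avg_labels = unweighted_average(model_probs)
vote_labels = majority_vote(model_probs)
print("averaging:", avg_labels)
print("voting:   ", vote_labels)
```

In this made-up example the two rules disagree on the second image: one confident model pulls the averaged probability above 0.5, while only one of the three individual votes is malignant. Averaging is sensitive to how confident each model is; voting only counts hard decisions.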
