Tafavvoghi Masoud, Sildnes Anders, Rakaee Mehrdad, Shvetsov Nikita, Bongo Lars Ailo, Busund Lill-Tove Rasmussen, Møllersen Kajsa
Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway.
Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway.
J Pathol Inform. 2024 Nov 17;16:100410. doi: 10.1016/j.jpi.2024.100410. eCollection 2025 Jan.
Classifying breast cancer molecular subtypes is crucial for tailoring treatment strategies. Immunohistochemistry (IHC) and gene expression profiling are the standard methods for molecular subtyping, but IHC can be subjective, and gene profiling is costly and unavailable in many regions. Previous studies have highlighted the potential of deep learning models applied to hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) for molecular subtyping, but these efforts vary in their methods, datasets, and reported performance. In this work, we investigated whether H&E-stained WSIs alone could be used to predict breast cancer molecular subtypes (luminal A, luminal B, HER2-enriched, and basal). We used 1433 WSIs of breast cancer in a two-step pipeline: first, classifying tiles as tumor or non-tumor so that only the tumor regions are used for molecular subtyping; and second, training four binary One-vs-Rest (OvR) classifiers and aggregating their outputs with an eXtreme Gradient Boosting model. The pipeline was tested on 221 hold-out WSIs, achieving an F1 score of 0.95 for tumor vs non-tumor classification and a macro F1 score of 0.73 for molecular subtyping. Our findings suggest that, with further validation, supervised deep learning models could serve as supportive tools for molecular subtyping in breast cancer. Our code is available to facilitate ongoing research and development.
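The macro F1 score reported for molecular subtyping is the unweighted mean of the per-class F1 scores, so each of the four subtypes counts equally regardless of how many slides it has. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the label strings and the toy slide-level predictions are made up for illustration and are not data from the study.

```python
from collections import Counter

# Hypothetical label names for the four subtypes named in the abstract.
SUBTYPES = ["LumA", "LumB", "HER2", "Basal"]


def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy slide-level labels (illustrative only, not results from the paper).
y_true = ["LumA", "LumA", "LumB", "HER2", "Basal", "Basal"]
y_pred = ["LumA", "LumB", "LumB", "HER2", "Basal", "LumA"]
print(round(macro_f1(y_true, y_pred, SUBTYPES), 4))
```

Because the average is unweighted, a classifier that performs poorly on a rare subtype is penalized as much as one that performs poorly on a common subtype, which is why macro F1 is a common choice for imbalanced multi-class problems.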