IEEE Trans Med Imaging. 2023 Jun;42(6):1809-1821. doi: 10.1109/TMI.2023.3241204. Epub 2023 Jun 1.
Whole-slide image (WSI) classification is fundamental to computational pathology, yet it is challenging because of the extremely high resolution of WSIs, the cost of manual annotation, and data heterogeneity. Multiple instance learning (MIL) is a promising approach to WSI classification, but it inherently suffers from a memory bottleneck caused by the gigapixel resolution. To avoid this issue, the overwhelming majority of existing approaches decouple the feature encoder from the MIL aggregator in MIL networks, which may substantially degrade performance. To this end, this paper presents a Bayesian Collaborative Learning (BCL) framework that addresses the memory bottleneck in WSI classification. Our basic idea is to introduce an auxiliary patch classifier that interacts with the target MIL classifier, so that the feature encoder and the MIL aggregator of the MIL classifier can be learned collaboratively without incurring the memory bottleneck. This collaborative learning procedure is formulated under a unified Bayesian probabilistic framework, and a principled Expectation-Maximization algorithm is developed to infer the optimal model parameters iteratively. As an implementation of the E-step, an effective quality-aware pseudo-labeling strategy is also proposed. BCL is extensively evaluated on three publicly available WSI datasets, CAMELYON16, TCGA-NSCLC and TCGA-RCC, achieving AUCs of 95.6%, 96.0% and 97.5%, respectively, and consistently outperforming all compared methods. Comprehensive analysis and discussion are also presented for an in-depth understanding of the method. To promote future work, our source code is released at: https://github.com/Zero-We/BCL.
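To make the alternation described above more concrete, the following is a minimal toy sketch of an EM-style loop in which a patch classifier is refined through pseudo-labels derived from bag-level (slide-level) labels. It is not the authors' implementation: the synthetic data, the linear patch classifier standing in for the feature encoder, max pooling standing in for the MIL aggregator, the top-k pseudo-labeling rule, the confidence weighting used as a stand-in for "quality-aware" labeling, and every function name are assumptions made purely for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_toy_bags(rng, n_bags=40, bag_size=20, dim=5, n_pos=3):
    """Synthetic 'slides': positive bags hide a few shifted 'tumor' patches."""
    bags, labels = [], []
    for i in range(n_bags):
        x = rng.normal(size=(bag_size, dim))
        y = i % 2
        if y == 1:
            x[:n_pos] += 4.0  # shifted instances play the role of tumor patches
        bags.append(x)
        labels.append(y)
    return bags, np.array(labels)

def e_step(bags, labels, w, b, top_k=3):
    """Hypothetical E-step: pseudo-label patches and weight them by confidence."""
    xs, ys, ws = [], [], []
    for x, y in zip(bags, labels):
        p = sigmoid(x @ w + b)
        pseudo = np.zeros(len(x))
        weight = np.ones(len(x))
        if y == 1:
            # The top-k scored patches of a positive bag become positives.
            pseudo[np.argsort(p)[-top_k:]] = 1.0
            # Remaining patches are negatives, down-weighted when uncertain.
            weight = np.where(pseudo == 1.0, 1.0, 1.0 - p)
        xs.append(x)
        ys.append(pseudo)
        ws.append(weight)
    return np.vstack(xs), np.concatenate(ys), np.concatenate(ws)

def m_step(x, y, weight, w, b, lr=0.5, steps=50):
    """M-step: weighted logistic regression fitted by gradient descent."""
    for _ in range(steps):
        g = weight * (sigmoid(x @ w + b) - y)
        w = w - lr * (x.T @ g) / weight.sum()
        b = b - lr * g.sum() / weight.sum()
    return w, b

def bag_score(x, w, b):
    """Max pooling over patch probabilities as a stand-in MIL aggregator."""
    return sigmoid(x @ w + b).max()

def run_toy_em(seed=0, rounds=5, dim=5):
    rng = np.random.default_rng(seed)
    bags, labels = make_toy_bags(rng, dim=dim)
    # Warm start: every patch inherits the label of its bag.
    x0 = np.vstack(bags)
    y0 = np.concatenate([np.full(len(x), float(y)) for x, y in zip(bags, labels)])
    w, b = m_step(x0, y0, np.ones(len(y0)), np.zeros(dim), 0.0)
    for _ in range(rounds):
        x, y, weight = e_step(bags, labels, w, b)
        w, b = m_step(x, y, weight, w, b)
    scores = np.array([bag_score(x, w, b) for x in bags])
    accuracy = float(((scores > 0.5).astype(int) == labels).mean())
    return accuracy, w, b

if __name__ == "__main__":
    acc, _, _ = run_toy_em()
    print(f"toy bag-level training accuracy: {acc:.2f}")
```

The point of the sketch is only the structure: each round processes patches individually under pseudo-labels, so no step requires holding a whole gigapixel slide in memory, while bag-level labels still steer the patch-level model.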