Yang Zhaochang, Wei Ting, Liang Ying, Yuan Xin, Gao RuiTian, Xia Yujia, Zhou Jie, Zhang Yue, Yu Zhangsheng
Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China.
Nat Commun. 2025 Mar 10;16(1):2366. doi: 10.1038/s41467-025-57587-y.
Computational pathology, which uses whole slide images (WSIs) for pathological diagnosis, has advanced the development of intelligent healthcare. However, the scarcity of annotated data and histological differences hinder the general application of existing methods. The abundance of histopathological data and the robustness of self-supervised models on small-scale data make foundation models for pathology a promising direction. Here we present BEPH (BEiT-based model Pre-training on Histopathological image), a foundation model that uses self-supervised learning to learn meaningful representations from 11 million unlabeled histopathological images. These representations are then efficiently adapted to various tasks, including patch-level cancer diagnosis, WSI-level cancer classification, and survival prediction for multiple cancer subtypes. By using masked image modeling (MIM) for pre-training, BEPH offers an efficient way to improve model performance, reduce reliance on expert annotations, and facilitate the broader application of artificial intelligence in clinical settings. The pre-trained model is available at https://github.com/Zhcyoung/BEPH.
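To make the masking step of masked image modeling concrete, the following is a minimal sketch and not the BEPH implementation. It assumes uniform random masking over a grid of square patches, which is a simplification: BEiT itself masks blocks of patches and predicts discrete visual tokens. The function names, the 224-pixel image size, the 16-pixel patch size, and the 0.4 mask ratio are illustrative assumptions and are not taken from the abstract.

```python
import random


def random_patch_mask(image_size, patch_size, mask_ratio, seed=0):
    """Return a 2D grid of booleans; True marks a patch hidden from the encoder.

    Hypothetical helper for illustration only (uniform random masking).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    grid = image_size // patch_size          # patches per side
    num_patches = grid * grid
    num_masked = int(round(mask_ratio * num_patches))
    rng = random.Random(seed)                # seeded for reproducibility
    masked = set(rng.sample(range(num_patches), num_masked))
    return [[(r * grid + c) in masked for c in range(grid)] for r in range(grid)]


def count_masked(mask):
    """Count how many patches are masked in the grid."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    mask = random_patch_mask(image_size=224, patch_size=16, mask_ratio=0.4)
    # A 224x224 image with 16x16 patches gives a 14x14 grid (196 patches).
    print(len(mask), len(mask[0]), count_masked(mask))  # prints "14 14 78"
```

During pre-training, a model would see only the unmasked patches and be trained to recover information about the masked ones, which is why no labels are needed.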