Transformer-based biomarker prediction from colorectal cancer histology: A large-scale multicentric study.
Affiliations
Helmholtz Munich - German Research Center for Environment and Health, Munich, Germany; School of Computation, Information and Technology, Technical University of Munich, Munich, Germany; Else Kroener Fresenius Center for Digital Health (EFFZ), Technical University Dresden, Dresden, Germany.
Helmholtz Munich - German Research Center for Environment and Health, Munich, Germany.
Publication information
Cancer Cell. 2023 Sep 11;41(9):1650-1661.e4. doi: 10.1016/j.ccell.2023.08.002. Epub 2023 Aug 30.
Deep learning (DL) can accelerate the prediction of prognostic biomarkers from routine pathology slides in colorectal cancer (CRC). However, current approaches rely on convolutional neural networks (CNNs) and have mostly been validated on small patient cohorts. Here, we develop a new transformer-based pipeline for end-to-end biomarker prediction from pathology slides by combining a pre-trained transformer encoder with a transformer network for patch aggregation. Our transformer-based approach substantially improves the performance, generalizability, data efficiency, and interpretability as compared with current state-of-the-art algorithms. After training and evaluating on a large multicenter cohort of over 13,000 patients from 16 colorectal cancer cohorts, we achieve a sensitivity of 0.99 with a negative predictive value of over 0.99 for prediction of microsatellite instability (MSI) on surgical resection specimens. We demonstrate that resection specimen-only training reaches clinical-grade performance on endoscopic biopsy tissue, solving a long-standing diagnostic problem.
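The two headline metrics in the abstract, sensitivity and negative predictive value (NPV), can be illustrated with a minimal sketch. The function names and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly positive (e.g. MSI) cases that the model flags as positive."""
    if tp + fn == 0:
        raise ValueError("no positive cases: sensitivity is undefined")
    return tp / (tp + fn)


def negative_predictive_value(tn: int, fn: int) -> float:
    """Fraction of negative predictions that are truly negative."""
    if tn + fn == 0:
        raise ValueError("no negative predictions: NPV is undefined")
    return tn / (tn + fn)


# Hypothetical counts, NOT results from the paper:
# 100 MSI cases (99 detected, 1 missed) and 900 non-MSI cases
# (450 correctly ruled out, 450 flagged for confirmatory testing).
tp, fn, tn, fp = 99, 1, 450, 450

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"NPV         = {negative_predictive_value(tn, fn):.3f}")
```

With counts like these, a negative prediction is almost always correct even though half of the non-MSI cases are still flagged. This is why sensitivity and NPV are the natural metrics to report for a rule-out test.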