National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China.
School of Informatics, Xiamen University, Xiamen, Fujian 361005, China.
Cell Rep Methods. 2024 Jun 17;4(6):100797. doi: 10.1016/j.crmeth.2024.100797.
Cancer of unknown primary (CUP) is metastatic cancer whose primary site remains unidentified despite standard diagnostic procedures. To determine the tumor origin in such cases, we developed BPformer, a deep learning method that integrates the transformer model with prior knowledge of biological pathways. Trained on transcriptomes from 10,410 primary tumors across 32 cancer types, BPformer achieved accuracies of 94% in primary tumors, 92% in primary sites of metastatic tumors, and 89% in metastatic sites of metastatic tumors, surpassing existing methods. BPformer was also validated in a retrospective study, in which its predictions were consistent with tumor sites diagnosed through immunohistochemistry and histopathology. Furthermore, BPformer ranked pathways by their contribution to tumor origin identification, which helped classify oncogenic signaling pathways into those that are highly conserved across cancers and those that vary strongly with tumor origin.