College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China.
Institute of Biomedical Engineering and Technology, Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, No. 500 Dongchuan Road, Shanghai, 200241, China.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae420.
The T cell receptor (TCR) repertoire is pivotal to the human immune system, and a better understanding of it can significantly improve our ability to predict cancer-related immune responses. However, existing methods often overlook intra- and inter-sequence interactions among TCRs, which limits the development of sequence-based prediction of cancer-related immune status. To address this challenge, we propose BertTCR, a deep learning framework designed to predict cancer-related immune status from TCRs. BertTCR combines a pre-trained protein large language model with deep learning architectures, enabling it to extract deeper contextual information from TCRs. Compared to three state-of-the-art sequence-based methods, BertTCR improves the AUC on an external validation set for thyroid cancer detection by 21 percentage points. The model was trained on more than 2000 publicly available TCR libraries covering 17 types of cancer as well as healthy samples, and it has been validated on multiple public external datasets for its ability to distinguish cancer patients from healthy individuals. Furthermore, BertTCR can accurately classify various cancer types and healthy individuals. Overall, BertTCR is an advanced method for TCR-based forecasting of cancer-related immune status, with promising potential for a wide range of immune status prediction tasks.
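As a reading aid for the AUC comparison above, the following is a minimal sketch of how an AUC gain in percentage points can be computed for two classifiers scored on the same samples. It is not the BertTCR evaluation code: the function name `roc_auc`, the variable names, and all labels and scores are hypothetical toy values chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    sample receives a higher score than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, NOT from the paper): 1 = cancer, 0 = healthy.
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.9, 0.4, 0.3, 0.8, 0.5, 0.2]  # hypothetical baseline method
improved_scores = [0.9, 0.7, 0.6, 0.8, 0.5, 0.2]  # hypothetical improved method

auc_baseline = roc_auc(labels, baseline_scores)
auc_improved = roc_auc(labels, improved_scores)

# A gain in "percentage points" is the absolute AUC difference times 100,
# not a relative (percent) change.
gain_pp = 100.0 * (auc_improved - auc_baseline)
print(f"baseline AUC={auc_baseline:.3f}, improved AUC={auc_improved:.3f}, "
      f"gain={gain_pp:.1f} percentage points")
```

The distinction matters when reading the reported result: a gain of 21 percentage points is an absolute difference between two AUC values, which is why the sketch subtracts the AUCs rather than dividing one by the other.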