School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China.
Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, China.
Front Immunol. 2024 Mar 7;15:1345586. doi: 10.3389/fimmu.2024.1345586. eCollection 2024.
T cell receptor (TCR) repertoires provide valuable insights into complex human diseases, including cancers. Recent advances in immune sequencing technology have significantly improved our understanding of TCR repertoires. Several computational methods have been devised to identify cancer-associated TCRs and to enable cancer detection from TCR sequencing data. However, existing methods often give inadequate consideration to the correlations among TCRs within a repertoire, which hinders the identification of crucial TCRs. In addition, cancer-associated TCRs are sparsely distributed within a repertoire, which makes accurate prediction difficult.
To address these issues, we present DeepLION2, a deep multi-instance contrastive learning framework designed to improve cancer-associated TCR prediction. DeepLION2 uses content-based sparse self-attention, in which each TCR attends only to its top related TCRs, to model inter-TCR correlations effectively. Furthermore, it adopts a contrastive learning strategy for bootstrapping parameter updates of the attention matrix, which prevents the model from fixating on non-cancer-associated TCRs.
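As a rough illustration of the sparse self-attention idea described above, the following minimal sketch lets each TCR attend only to its k highest-scoring TCRs. It is not the DeepLION2 implementation: the function name `topk_sparse_self_attention`, the parameter `k`, and the use of raw embeddings as queries, keys, and values (with no learned projections) are assumptions made for this example.

```python
import math

def topk_sparse_self_attention(embeddings, k):
    """Toy top-k sparse self-attention over a list of TCR embedding vectors.

    Assumption: the embeddings themselves serve as queries, keys, and values.
    Returns (outputs, weights), where weights[i][j] is how much TCR i attends
    to TCR j and is exactly 0.0 outside the k best-scoring TCRs of row i.
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    outputs, weights = [], []
    for query in embeddings:
        # Scaled dot-product score between this TCR and every TCR in the repertoire.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        # Indices of the k highest-scoring TCRs for this query.
        top = sorted(range(n), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax restricted to the selected TCRs (max subtracted for stability).
        peak = max(scores[j] for j in top)
        exps = {j: math.exp(scores[j] - peak) for j in top}
        total = sum(exps.values())
        row = [exps[j] / total if j in exps else 0.0 for j in range(n)]
        weights.append(row)
        # Weighted sum of the selected TCR embeddings.
        outputs.append(
            [sum(row[j] * embeddings[j][c] for j in top) for c in range(dim)]
        )
    return outputs, weights

# Four toy 2-dimensional "TCR embeddings" forming two similar pairs.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
outputs, weights = topk_sparse_self_attention(toy, k=2)
```

In this toy input, the first TCR attends only to itself and the second TCR, so the attention matrix stays sparse regardless of repertoire size.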
Extensive experiments on diverse patient cohorts, covering more than ten cancer types, demonstrated that DeepLION2 significantly outperformed current state-of-the-art methods in terms of accuracy, sensitivity, specificity, Matthews correlation coefficient, and area under the curve (AUC). Notably, DeepLION2 achieved AUC values of 0.933, 0.880, and 0.763 on the thyroid, lung, and gastrointestinal cancer cohorts, respectively. Furthermore, it identified cancer-associated TCRs along with their key motifs, highlighting the amino acids that play a crucial role in TCR-peptide binding.
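For readers less familiar with the five evaluation metrics named above, the sketch below computes them from binary labels and predicted scores using their standard definitions. The function name `binary_metrics`, the 0.5 decision threshold, and the toy data are assumptions for illustration and are not taken from the study.

```python
import math

def binary_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, specificity, MCC, and AUC for binary labels (1 = cancer).

    Assumes both classes are present in `labels`.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)

    accuracy = (tp + tn) / len(labels)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # AUC as the probability that a random positive outscores a random negative
    # (ties count as one half).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "auc": auc,
    }

# Toy example: three cancer samples and three healthy samples.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
result = binary_metrics(toy_labels, toy_scores)
```

Unlike the other four metrics, AUC does not depend on the decision threshold, which is one reason it is commonly reported alongside them.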
These compelling results underscore DeepLION2's potential for enhancing cancer detection and facilitating personalized cancer immunotherapy. DeepLION2 is publicly available on GitHub, at https://github.com/Bioinformatics7181/DeepLION2, for academic use only.