Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America.
Herbold Computational Biology Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America.
PLoS Comput Biol. 2023 Nov 20;19(11):e1011664. doi: 10.1371/journal.pcbi.1011664. eCollection 2023 Nov.
T cells rely on their T cell receptors (TCRs) to discern foreign antigens presented by human leukocyte antigen (HLA) proteins. An individual's TCRs contain a record of that individual's past immune activity, such as immune responses to infections or vaccines. Mining TCR data may therefore recover useful information or biomarkers for immune-related diseases or conditions. Some TCRs are observed only in individuals with certain HLA alleles, so characterizing TCRs requires a thorough understanding of TCR-HLA associations. The extensive diversity of HLA alleles and the rarity of some of them present a formidable challenge for this task. Existing methods either treat HLA as a categorical variable or represent an HLA by its alphanumeric name, and therefore have limited ability to generalize to HLAs not seen during training. To address this challenge, we propose a neural network-based method named Deep learning Prediction of TCR-HLA association (DePTH), which predicts TCR-HLA associations from the amino acid sequences of the TCR and the HLA. We demonstrate that DePTH can make reasonable predictions for TCR-HLA associations even when neither the HLA nor the TCR has been included in the training dataset. Furthermore, we establish that DePTH can be used to quantify the functional similarities among HLA alleles, and that these HLA similarities are associated with the survival outcomes of cancer patients who received immune checkpoint blockade treatments.