Laboratoire de Physique de l'Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, Paris, France.
Immuno-Oncology Service, Human Oncology and Pathogenesis Program, Hepatopancreatobiliary Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, New York State, United States of America.
PLoS Comput Biol. 2021 Sep 2;17(9):e1009297. doi: 10.1371/journal.pcbi.1009297. eCollection 2021 Sep.
As high-throughput next-generation sequencing makes it increasingly feasible to quantify the diversity of the human T cell receptor (TCR) repertoire, inferring antigen specificity from TCR sequences could greatly aid potential diagnostics and therapeutics. Here, we use a machine-learning approach known as the Restricted Boltzmann Machine to develop a sequence-based inference method that identifies antigen-specific TCRs. Our approach combines probabilistic models of TCR sequences with clone abundance information to extract TCR sequence motifs central to an antigen-specific response. We use this model to identify patient-personalized TCR motifs that respond to individual tumor and infectious disease antigens, and to accurately discriminate specific from non-specific responses. Furthermore, the hidden structure of the model yields an interpretable representation space in which TCRs responding to the same antigen cluster together, correctly discriminating the responses of TCRs to different viral epitopes. The model can be used to identify condition-specific responding TCRs. We focus on examples of TCRs reactive to candidate neoantigens and selected epitopes in experiments of stimulated TCR clone expansion.
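To make the idea of scoring TCR sequences with a Restricted Boltzmann Machine more concrete, the following is a minimal sketch and not the authors' implementation. It assumes equal-length (aligned) amino-acid sequences, binary hidden units, and hypothetical parameter arrays `fields`, `weights`, and `hidden_bias` that would come from training. The function names and the simple abundance-weighted average are illustrative assumptions, since the abstract does not specify how clone counts enter the model.

```python
import math

# The 20 standard amino acids; each residue is treated as a categorical visible unit.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def hidden_inputs(seq, weights):
    """Input to each hidden unit: the sum over positions of the weight
    selected by the residue at that position. These inputs serve as
    coordinates of the sequence in the representation space.
    weights[h][pos][aa] is a hypothetical trained parameter array."""
    return [
        sum(w[pos][AA_INDEX[aa]] for pos, aa in enumerate(seq))
        for w in weights
    ]


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def free_energy(seq, fields, weights, hidden_bias):
    """Free energy of a sequence after summing out binary hidden units
    (an assumption of this sketch). Lower free energy means higher
    unnormalized probability under the model."""
    field_term = sum(fields[pos][AA_INDEX[aa]] for pos, aa in enumerate(seq))
    hidden_term = sum(
        softplus(b + h)
        for b, h in zip(hidden_bias, hidden_inputs(seq, weights))
    )
    return -field_term - hidden_term


def weighted_mean_free_energy(clones, fields, weights, hidden_bias):
    """Abundance-weighted average free energy over (sequence, count) pairs.
    Weighting by clone count is one simple, assumed way to let abundant
    clones contribute more to a training objective."""
    total = sum(count for _, count in clones)
    return sum(
        count * free_energy(seq, fields, weights, hidden_bias)
        for seq, count in clones
    ) / total
```

In a sketch like this, sequences with low free energy would be candidates for antigen-specific responders, and plotting `hidden_inputs` for many sequences is one way to look for clusters of TCRs that respond to the same antigen.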