Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
Life Science Zurich Graduate School, ETH Zurich and University of Zurich, 8006 Zurich, Switzerland.
Annu Rev Chem Biomol Eng. 2021 Jun 7;12:39-62. doi: 10.1146/annurev-chembioeng-101420-125021. Epub 2021 Apr 14.
Adaptive immunity is mediated by B and T lymphocytes, which express vast and diverse repertoires of B cell and T cell receptors, respectively. In conjunction with peptide antigen presentation through major histocompatibility complexes (MHCs), these receptors allow lymphocytes to recognize and respond to pathogens and diseased cells. In recent years, advances in deep sequencing have led to a massive increase in the amount of adaptive immune receptor repertoire data; additionally, proteomics techniques have led to a wealth of data on peptide-MHC presentation. These large-scale data sets are now making it possible to train machine learning and deep learning models, which can be used to identify complex and high-dimensional patterns in immune repertoires. This article introduces adaptive immune repertoires and machine learning and deep learning as applied to biological sequence data, and then summarizes the many applications in this field, which range from predicting the immunological status of a host to predicting the antigen specificity of individual receptors and engineering immunotherapeutics.