Jiang Shan, Su Zhaoqian, Bloodworth Nathaniel, Liu Yunchao, Martina Cristina, Harrison David G, Meiler Jens
Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, TN, United States.
Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States.
bioRxiv. 2024 Nov 21:2024.11.19.624425. doi: 10.1101/2024.11.19.624425.
Class I major histocompatibility complexes (MHC-I), encoded in humans by the highly polymorphic HLA-A, HLA-B, and HLA-C genes, are expressed on all nucleated cells. Both self and foreign proteins are processed into peptides of 8 to 10 amino acids, loaded onto MHC-I within the endoplasmic reticulum, and then presented on the cell surface. Foreign peptides presented in this fashion activate CD8+ T cells, and their immunogenicity correlates with their affinity for the MHC-I binding groove. Predicting antigen binding affinity for MHC-I is therefore a valuable way to identify potentially immunogenic antigens. Although many predictors of MHC-I binding exist, no currently available tool can predict antigen/MHC-I binding affinity for antigens with explicitly labeled post-translational modifications or unusual/non-canonical amino acids (NCAAs), even though such modifications are increasingly recognized as critical mediators of peptide immunogenicity. In this work, we propose a machine learning application that quantifies the binding affinity of NCAA-containing epitopes to MHC-I, and we compare its performance with that of other commonly used regressors. Our model demonstrates robust performance, with 5-fold cross-validation yielding an R value of 0.477 and a root-mean-square error (RMSE) of 0.735, indicating strong predictive capability for peptides with NCAAs. This work provides a valuable tool for the computational design and optimization of peptides incorporating NCAAs, potentially accelerating the development of novel peptide-based therapeutics with enhanced properties and efficacy.
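As an illustration of how the two reported metrics are computed, the following is a minimal sketch of 5-fold cross-validation that collects out-of-fold predictions and scores them with a correlation coefficient and RMSE. It is not the authors' pipeline. The single-feature least-squares regressor `fit_line`, the synthetic data, and the choice to pool predictions across folds before scoring (rather than averaging per-fold scores) are all assumptions made for the example, as is reading the abstract's "R value" as a Pearson correlation.

```python
import math
import random


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-square error between targets and predictions."""
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))


def fit_line(xs, ys):
    """Hypothetical placeholder model: one-feature ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def k_fold_predictions(xs, ys, fit, k=5, seed=0):
    """Return one out-of-fold prediction per sample using k-fold CV."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index sets
    preds = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = model(xs[i])  # predicted by a model that never saw i
    return preds


# Synthetic data (assumption): a noisy linear relationship.
rng = random.Random(1)
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

preds = k_fold_predictions(xs, ys, fit_line, k=5)
print(f"R = {pearson_r(ys, preds):.3f}, RMSE = {rmse(ys, preds):.3f}")
```

Because every sample is scored by a model that did not see it during fitting, the resulting R and RMSE estimate performance on unseen peptides rather than fit to the training data.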