Department of Chemistry & Biochemistry and the Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN, United States.
Front Immunol. 2022 Apr 25;13:887759. doi: 10.3389/fimmu.2022.887759. eCollection 2022.
There is long-standing interest in accurately modeling the structural features of peptides bound and presented by class I MHC proteins. This interest has grown with the advent of rapid genome sequencing and the prospect of personalized, peptide-based cancer vaccines, as well as the development of molecular and cellular therapeutics based on T cell receptor recognition of peptide-MHC. However, while the speed and accessibility of peptide-MHC modeling have improved substantially over the years, improvements in accuracy have been modest. Accuracy is crucial in peptide-MHC modeling, as T cell receptors are highly sensitive to peptide conformation and capturing fine details is therefore necessary for useful models. Studying nonameric peptides presented by the common class I MHC protein HLA-A*02:01, here we addressed a key question common to modern modeling efforts: from a set of models (or decoys) generated through conformational sampling, which is best? We found that the common strategy of decoy selection by lowest energy can lead to substantial errors in predicted structures. We therefore adopted a data-driven approach and trained functions capable of predicting near-native decoys with exceptionally high accuracy. Although our implementation is limited to nonamer/HLA-A*02:01 complexes, our results serve as an important proof of concept from which improvements can be made and, given the significance of HLA-A*02:01 and its preference for nonameric peptides, should have immediate utility in select immunotherapeutic and other efforts for which structural information would be advantageous.
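The decoy-selection question raised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Decoy` class, the `learned_score` field (standing in for the output of some trained scoring function), and every numeric value are hypothetical and chosen only to show how picking the lowest-energy decoy and picking the decoy ranked highest by a learned score can disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoy:
    """One candidate peptide-MHC model from conformational sampling (hypothetical)."""
    name: str
    energy: float          # force-field energy; lower is conventionally preferred
    learned_score: float   # assumed output of a trained scoring function; higher is preferred
    rmsd_to_native: float  # peptide RMSD in angstroms; known only when benchmarking


def select_by_lowest_energy(decoys):
    """Common strategy: pick the decoy with the minimum energy."""
    return min(decoys, key=lambda d: d.energy)


def select_by_learned_score(decoys):
    """Data-driven alternative: pick the decoy the learned function ranks highest."""
    return max(decoys, key=lambda d: d.learned_score)


# Made-up illustrative values, not data from the study.
decoys = [
    Decoy("decoy_a", energy=-512.3, learned_score=0.21, rmsd_to_native=2.4),
    Decoy("decoy_b", energy=-498.7, learned_score=0.93, rmsd_to_native=0.6),
    Decoy("decoy_c", energy=-505.1, learned_score=0.40, rmsd_to_native=1.8),
]

by_energy = select_by_lowest_energy(decoys)
by_score = select_by_learned_score(decoys)

print(f"lowest energy: {by_energy.name} (RMSD {by_energy.rmsd_to_native} A)")
print(f"learned score: {by_score.name} (RMSD {by_score.rmsd_to_native} A)")
```

In this toy set the lowest-energy decoy is not the one closest to the native structure, which is the failure mode the abstract describes. Whether a learned score does better in practice depends entirely on how that function is trained and validated.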