Institute for Biological Physics, University of Cologne, Cologne, Germany.
Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America.
PLoS Comput Biol. 2024 Sep 3;20(9):e1012380. doi: 10.1371/journal.pcbi.1012380. eCollection 2024 Sep.
Molecules of the Major Histocompatibility Complex (MHC) present short protein fragments on the cell surface, an important step in T cell immune recognition. MHC-I molecules process peptides from intracellular proteins; MHC-II molecules act in antigen-presenting cells and present peptides derived from extracellular proteins. Here we show that the sequence-dependent energy landscapes of MHC-peptide binding encode class-specific nonlinearities (epistasis). MHC-I has a smooth landscape with global epistasis; the binding energy is a simple deformation of an underlying linear trait. This form of epistasis enhances the discrimination between strong-binding peptides. In contrast, MHC-II has a rugged landscape with idiosyncratic epistasis: binding depends on detailed amino acid combinations at multiple positions of the peptide sequence. The form of epistasis affects the learning of energy landscapes from training data. For MHC-I, a low-complexity problem, we derive a simple matrix model of binding energies that outperforms current models trained by machine learning. For MHC-II, higher complexity prevents learning by simple regression methods. Epistasis also affects the energy and fitness effects of mutations in antigen-derived peptides (epitopes). In MHC-I, large-effect mutations occur predominantly in anchor positions of strong-binding epitopes. In MHC-II, large effects depend on the background epitope sequence but are broadly distributed over the epitope, generating a bigger target for escape mutations due to loss of presentation. Together, our analysis shows how an energy landscape of protein-protein binding constrains the target of escape mutations from T cell immunity, linking the complexity of the molecular interactions to the dynamics of adaptive immune response.
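As a rough illustration of the global epistasis described for MHC-I, the sketch below computes an additive (matrix) trait for a peptide and passes it through a single monotonic nonlinearity. It is a minimal sketch under stated assumptions, not the model derived in the paper: the function names, the 9-mer length, the zero-filled weights, and the softplus nonlinearity are all hypothetical choices made for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def additive_trait(peptide, weights):
    """Underlying linear trait: sum of position-specific matrix entries."""
    return float(sum(weights[pos, AA_INDEX[aa]] for pos, aa in enumerate(peptide)))


def global_epistasis_energy(peptide, weights):
    """Energy as one monotonic deformation of the additive trait.

    Softplus is an illustrative assumption, not the paper's fitted function.
    """
    return float(np.logaddexp(0.0, additive_trait(peptide, weights)))


def mutation_effect(peptide, pos, new_aa, weights):
    """Energy change caused by substituting new_aa at position pos."""
    mutant = peptide[:pos] + new_aa + peptide[pos + 1:]
    return global_epistasis_energy(mutant, weights) - global_epistasis_energy(peptide, weights)


# Hypothetical 9-mer weight matrix (positions x amino acids), mostly zero.
weights = np.zeros((9, 20))
weights[0, AA_INDEX["A"]] = 2.0
weights[1, AA_INDEX["A"]] = 3.0

# The same substitution (C -> A at position 0) in two sequence backgrounds.
effect_low = mutation_effect("CCCCCCCCC", 0, "A", weights)
effect_high = mutation_effect("CACCCCCCC", 0, "A", weights)
```

In this toy setting the substitution shifts the additive trait by the same amount in both backgrounds, yet the energy change differs because the nonlinearity is evaluated at different trait values. The background matters only through the value of the linear trait, which is what distinguishes global epistasis from the idiosyncratic, combination-specific epistasis the abstract reports for MHC-II.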