Müller Markus, Gfeller David, Coukos George, Bassani-Sternberg Michal
Vital-IT, Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Front Immunol. 2017 Oct 20;8:1367. doi: 10.3389/fimmu.2017.01367. eCollection 2017.
The remarkable clinical efficacy of immune checkpoint blockade therapies has motivated researchers to discover immunogenic epitopes and exploit them for personalized vaccines. Human leukocyte antigen (HLA)-binding peptides derived from the processing and presentation of mutated proteins are among the leading targets for T-cell recognition of cancer cells. Currently, most studies attempt to identify neoantigens based on predicted affinity to HLA molecules, but the performance of such prediction algorithms is rather poor for rare HLA class I alleles and for HLA class II. Direct identification of neoantigens by mass spectrometry (MS) is becoming feasible; however, it is not yet applicable to most patients and lacks sensitivity. To capitalize on existing immunopeptidomics data and extract information that could complement HLA-binding prediction, we first compiled a large HLA class I and class II immunopeptidomics database across dozens of cell types and HLA allotypes and detected hotspots, that is, protein subsequences that are frequently presented. About 3% of the peptidome was detected in both class I and class II. Based on the gene ontology of their source proteins and the peptides' length, we propose that these shared peptides may be processed by the cellular class II presentation machinery. Our database captures the global nature of the peptidome averaged over many HLA alleles and therefore reflects the propensity of peptides to be presented on HLA complexes, which is complementary to existing neoantigen prediction features such as binding affinity and stability or RNA abundance. We further introduce two immunopeptidomics MS-based features to guide prioritization of neoantigens: the number of peptides matching a protein in our database and the overlap of the predicted wild-type peptide with other peptides in our database. We show as a proof of concept that our immunopeptidomics MS-based features improved neoantigen prioritization by up to 50%. Overall, our work shows that, in addition to providing a large amount of training data to improve HLA-binding prediction, immunopeptidomics also captures other aspects of natural presentation that significantly improve prediction of clinically relevant neoantigens.
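The two MS-based features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give exact definitions, so the sketch assumes that a peptide "matches" a protein when it occurs as an exact substring, and that "overlap" means sharing at least one residue position with the wild-type peptide. The toy database layout, the protein ID `P_TOY`, the sequences, and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy database: protein ID -> (protein sequence, peptides observed by MS).
Database = Dict[str, Tuple[str, List[str]]]


def find_all(sequence: str, peptide: str) -> List[int]:
    """Return every start index at which `peptide` occurs in `sequence`."""
    starts = []
    pos = sequence.find(peptide)
    while pos != -1:
        starts.append(pos)
        pos = sequence.find(peptide, pos + 1)
    return starts


def protein_peptide_count(db: Database, protein_id: str) -> int:
    """Feature 1 (assumed definition): distinct database peptides found in the protein."""
    sequence, peptides = db[protein_id]
    return sum(1 for p in set(peptides) if p in sequence)


def overlap_count(db: Database, protein_id: str, wt_peptide: str) -> int:
    """Feature 2 (assumed definition): distinct database peptides that share
    at least one residue position with the wild-type peptide."""
    sequence, peptides = db[protein_id]
    # Half-open [start, end) spans covered by the wild-type peptide.
    wt_spans = [(s, s + len(wt_peptide)) for s in find_all(sequence, wt_peptide)]
    count = 0
    for p in set(peptides):
        spans = [(s, s + len(p)) for s in find_all(sequence, p)]
        # Two half-open intervals overlap when each starts before the other ends.
        if any(a < d and c < b for a, b in wt_spans for c, d in spans):
            count += 1
    return count


# Made-up example data, for illustration only.
toy_db: Database = {
    "P_TOY": (
        "MKTAYIAKQRQISFVKSHFSRQ",
        ["AYIAKQRQI", "KQRQISFVK", "VKSHFSRQ", "AYIAKQRQI"],
    )
}

n_peptides = protein_peptide_count(toy_db, "P_TOY")
n_overlap = overlap_count(toy_db, "P_TOY", "IAKQRQISF")
print(n_peptides, n_overlap)
```

In this toy case the duplicate peptide is counted once, and the C-terminal peptide does not overlap the wild-type peptide, so it contributes to the first feature only. How such counts would be weighted against binding affinity or RNA abundance is not specified in the abstract and is left open here.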