Precision Neoantigen Discovery Using Large-Scale Immunopeptidomes and Composite Modeling of MHC Peptide Presentation.

Affiliations

Personalis, Inc, Menlo Park, California, USA.

Department of Genetics, Stanford University, Palo Alto, California, USA.

Publication information

Mol Cell Proteomics. 2023 Apr;22(4):100506. doi: 10.1016/j.mcpro.2023.100506. Epub 2023 Feb 14.

Abstract

Major histocompatibility complex (MHC)-bound peptides that originate from tumor-specific genetic alterations, known as neoantigens, are an important class of anticancer therapeutic targets. Accurately predicting peptide presentation by MHC complexes is a key aspect of discovering therapeutically relevant neoantigens. Technological improvements in mass spectrometry-based immunopeptidomics and advanced modeling techniques have vastly improved MHC presentation prediction over the past 2 decades. However, improvement in the accuracy of prediction algorithms is needed for clinical applications like the development of personalized cancer vaccines, the discovery of biomarkers for response to immunotherapies, and the quantification of autoimmune risk in gene therapies. Toward this end, we generated allele-specific immunopeptidomics data using 25 monoallelic cell lines and created Systematic Human Leukocyte Antigen (HLA) Epitope Ranking Pan Algorithm (SHERPA), a pan-allelic MHC-peptide algorithm for predicting MHC-peptide binding and presentation. In contrast to previously published large-scale monoallelic data, we used an HLA-null K562 parental cell line and a stable transfection of HLA allele to better emulate native presentation. Our dataset includes five previously unprofiled alleles that expand MHC diversity in the training data and extend allelic coverage in underprofiled populations. To improve generalizability, SHERPA systematically integrates 128 monoallelic and 384 multiallelic samples with publicly available immunoproteomics data and binding assay data. Using this dataset, we developed two features that empirically estimate the propensities of genes and specific regions within gene bodies to engender immunopeptides to represent antigen processing. 
Using a composite model constructed with gradient boosting decision trees, multiallelic deconvolution, and 2.15 million peptides encompassing 167 alleles, we achieved a 1.44-fold improvement of positive predictive value compared with existing tools when evaluated on independent monoallelic datasets and a 1.17-fold improvement when evaluating on tumor samples. With a high degree of accuracy, SHERPA has the potential to enable precision neoantigen discovery for future clinical applications.
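
The fold improvements above are ratios of positive predictive value (PPV) between two predictors, and the abstract does not spell out how PPV was computed. As a minimal sketch, assume PPV is taken over the top-k ranked peptides, with k equal to the number of truly presented peptides. This convention is an assumption here, not something the abstract confirms. The function name, scores, and labels below are hypothetical toy values, not data from the study.

```python
from typing import Sequence


def ppv_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of truly presented peptides among the k highest-scoring ones."""
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and the number of peptides")
    # Rank peptides from highest to lowest predicted presentation score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


# Toy benchmark: 1 = peptide observed by mass spectrometry, 0 = decoy.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
baseline_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.4, 0.1, 0.6]   # hypothetical existing tool
candidate_scores = [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.5]  # hypothetical new model

k = sum(labels)  # assumed convention: k = number of true positives
ppv_baseline = ppv_at_k(baseline_scores, labels, k)
ppv_candidate = ppv_at_k(candidate_scores, labels, k)

# Fold improvement is the ratio of the two PPVs.
fold_improvement = ppv_candidate / ppv_baseline
print(f"baseline PPV={ppv_baseline:.2f}, candidate PPV={ppv_candidate:.2f}, "
      f"fold={fold_improvement:.2f}")
```

Under this reading, a 1.44-fold improvement means the new model's top-k list contains 1.44 times as many truly presented peptides as the comparison tool's list of the same size.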

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d25/10114598/229a6897f3f6/fx1.jpg
