Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium.
Laboratory of Protein Science, Proteomics and Epigenetic Signaling (PPES), Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium.
Proteomics. 2024 Apr;24(8):e2300336. doi: 10.1002/pmic.202300336. Epub 2023 Nov 27.
Immunopeptidomics is a key technology in the discovery of targets for immunotherapy and vaccine development. However, identifying immunopeptides remains challenging due to their non-tryptic nature, which results in distinct spectral characteristics. Moreover, the absence of strict digestion rules leads to extensive search spaces, further amplified by the incorporation of somatic mutations, pathogen genomes, unannotated open reading frames, and post-translational modifications. This inflation in search space leads to an increase in random high-scoring matches, resulting in fewer identifications at a given false discovery rate. Peptide-spectrum match rescoring has emerged as a machine learning-based solution to address challenges in mass spectrometry-based immunopeptidomics data analysis. It involves post-processing unfiltered spectrum annotations to better distinguish between correct and incorrect peptide-spectrum matches. Recently, features based on predicted peptidoform properties, including fragment ion intensities, retention time, and collisional cross section, have been used to improve the accuracy and sensitivity of immunopeptide identification. In this review, we describe the diverse bioinformatics pipelines that are currently available for peptide-spectrum match rescoring and discuss how they can be used for the analysis of immunopeptidomics data. Finally, we provide insights into current and future machine learning solutions to boost immunopeptide identification.
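The link the abstract draws between random high-scoring matches and the number of identifications at a fixed false discovery rate can be illustrated with a minimal target-decoy sketch. This is not the method of any particular rescoring tool; the function name `accepted_targets`, the `(score, is_decoy)` input layout, and the simple decoys/targets FDR estimate are assumptions made for illustration only, and the scores are toy values.

```python
from typing import List, Tuple


def accepted_targets(psms: List[Tuple[float, bool]], fdr_threshold: float = 0.01) -> int:
    """Count target PSMs accepted at a given FDR (hypothetical helper).

    psms: (score, is_decoy) pairs, where a higher score means a better match.
    The FDR at each score cutoff is estimated as decoys / targets.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the largest set of targets whose estimated FDR is acceptable.
        if targets > 0 and decoys / targets <= fdr_threshold:
            best = targets
    return best


# Toy search-engine scores: one decoy outranks two correct target matches.
search_engine = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]

# Toy rescored values: the same PSMs after a (hypothetical) rescoring step
# that pushes the decoy below all targets.
rescored = [(10.0, False), (9.0, False), (7.5, False), (7.0, False), (1.0, True)]

before = accepted_targets(search_engine, fdr_threshold=0.0)
after = accepted_targets(rescored, fdr_threshold=0.0)
```

In this toy case, a single high-scoring decoy cuts off the targets ranked below it, and a rescoring step that separates the decoy from the targets recovers them at the same threshold.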