Declercq Arthur, Devreese Robbe, Scheid Jonas, Jachmann Caroline, Van Den Bossche Tim, Preikschat Annica, Gomez-Zepeda David, Rijal Jeewan Babu, Hirschler Aurélie, Krieger Jonathan R, Srikumar Tharan, Rosenberger George, Martelli Claudia, Trede Dennis, Carapito Christine, Tenzer Stefan, Walz Juliane S, Degroeve Sven, Bouwmeester Robbin, Martens Lennart, Gabriels Ralf
VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium.
Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium.
J Proteome Res. 2025 Mar 7;24(3):1067-1076. doi: 10.1021/acs.jproteome.4c00609. Epub 2025 Feb 6.
The high-throughput analysis of proteins with mass spectrometry (MS) is highly valuable for understanding human biology, discovering disease biomarkers, identifying therapeutic targets, and exploring pathogen interactions. To achieve these goals, specialized proteomics subfields, including plasma proteomics, immunopeptidomics, and metaproteomics, must tackle specific analytical challenges, such as an increased identification ambiguity compared to routine proteomics experiments. Technical advancements in MS instrumentation can mitigate these issues by acquiring more discerning information at higher sensitivity levels. This is exemplified by the incorporation of ion mobility and parallel accumulation and serial fragmentation (PASEF) technologies in timsTOF instruments. In addition, AI-based bioinformatics solutions can help overcome ambiguity issues by integrating more data into the identification workflow. Here, we introduce TIMS2Rescore, a data-driven rescoring workflow optimized for DDA-PASEF data from timsTOF instruments. This platform includes new timsTOF MS2PIP spectrum prediction models and IM2Deep, a new deep learning-based peptide ion mobility predictor. Furthermore, to fully streamline data throughput, TIMS2Rescore directly accepts Bruker raw mass spectrometry data and search results from ProteoScape and many other search engines, including Sage and PEAKS. We showcase TIMS2Rescore performance on plasma proteomics, immunopeptidomics (HLA class I and II), and metaproteomics data sets. TIMS2Rescore is open-source and freely available at https://github.com/compomics/tims2rescore.