Benchmarking Spectral Library and Database Search Approaches for Metaproteomics Using a Ground-Truth Microbiome Dataset.

Authors

Rajczewski Andrew T, Mehta Subina, Wagner Reid, Gabriel Wassim, Johnson James, Do Katherine, Vintila Simina, Wilhelm Mathias, Kleiner Manuel, Searle Brian C, Griffin Timothy J, Jagtap Pratik D

Affiliations

University of Minnesota, Minneapolis, MN.

Computational Mass Spectrometry, Technical University of Munich, Freising, Germany.

Publication information

bioRxiv. 2025 May 20:2025.05.15.654320. doi: 10.1101/2025.05.15.654320.

Abstract

Mass spectrometry-based metaproteomics, the identification and quantification of thousands of proteins expressed by complex microbial communities, has become pivotal for unraveling functional interactions within microbiomes. However, metaproteomics data analysis encounters many challenges, including the search of tandem mass spectra against a protein sequence database using proteomics database search algorithms. We used a ground-truth dataset to assess a spectral library searching method against established database searching approaches. Mass spectrometry data collected by data-dependent acquisition (DDA-MS) was analyzed using database searching approaches (MaxQuant and FragPipe), as well as using Scribe with Prosit predicted spectral libraries. We used FASTA databases that included protein sequences from microbial species present in the ground-truth dataset along with background protein sequences, to estimate error rates and assess the effects on detection, peptide-spectral match quality, and quantification. Using the Scribe search engine resulted in more proteins detected at a 1% false discovery rate (FDR) compared to MaxQuant or FragPipe, while FragPipe detected more peptides verified by PepQuery. Scribe was able to detect more low-abundance proteins in the microbiome dataset and was more accurate in quantifying the microbial community composition. This research provides insights and guidance for metaproteomics researchers aiming to optimize results in their analysis of DDA-MS data.
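The abstract describes estimating error rates by searching against a database that mixes ground-truth sequences with background sequences. A minimal sketch of that idea follows, assuming each accepted identification has already been labeled by which part of the database it came from. The `Identification` class, the `background_hit_fraction` function, and the example protein names are hypothetical illustrations, not the authors' pipeline, and the simple ratio shown here is only one of several ways such an error rate could be computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identification:
    """One accepted protein identification (hypothetical structure)."""
    protein_id: str
    # True if the protein comes from the background sequences,
    # i.e. from organisms not expected in the ground-truth sample.
    is_background: bool


def background_hit_fraction(identifications):
    """Return the fraction of accepted identifications that map to
    background proteins. Under the assumption that background proteins
    are truly absent from the sample, every such hit is a false positive,
    so this fraction gives a rough empirical view of the error rate."""
    if not identifications:
        return 0.0
    background = sum(1 for ident in identifications if ident.is_background)
    return background / len(identifications)


# Made-up example data: three ground-truth hits and one background hit.
hits = [
    Identification("groundtruth_protein_1", False),
    Identification("groundtruth_protein_2", False),
    Identification("groundtruth_protein_3", False),
    Identification("background_protein_1", True),
]

fraction = background_hit_fraction(hits)
print(f"Background hit fraction: {fraction:.2%}")
```

Comparing a fraction like this with the nominal 1% FDR threshold reported by a search engine is one way to check whether the reported threshold is realistic on a dataset of known composition.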

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2d79/12139738/475fdee9a2be/nihpp-2025.05.15.654320v1-f0001.jpg
