Sun Zhongzhi, Ning Zhibin, Cheng Kai, Duan Haonan, Wu Qing, Mayne Janice, Figeys Daniel
School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada.
Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada.
Comput Struct Biotechnol J. 2023 Aug 29;21:4228-4237. doi: 10.1016/j.csbj.2023.08.025. eCollection 2023.
Metaproteomics has increasingly been applied to study functional changes in the human gut microbiome. Peptide identification is an important step in metaproteomics research, and sequence database search (SDS) and spectral library search (SLS) are the two main methods for identifying peptides. However, the large search space in metaproteomics studies poses significant challenges for both methods. Moreover, with advances in mass spectrometry, it is now feasible to perform metaproteomic projects involving 100-1000 individual microbiomes, and projects at this scale make searching large databases a conundrum. In this study, we constructed MetaPep, a core peptide database (comprising collections of both peptide sequences and tandem MS spectra) that greatly accelerates peptide identification. Raw files from fifteen metaproteomics projects were re-analyzed, and the identified peptide-spectrum matches (PSMs) were used to construct the MetaPep database. The resulting database achieved rapid and accurate identification of peptides for human gut metaproteomics. MetaPep contains a large collection of peptides and spectra that have been identified in published human gut metaproteomics datasets, and it can serve as an important resource at the current stage of human gut metaproteomics research. This study demonstrates the feasibility of using a core peptide database in a generic metaproteomics workflow. MetaPep could also be an important resource for future human gut metaproteomics research, such as DIA (data-independent acquisition) analysis.
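The abstract describes pooling PSMs from re-analyzed projects into a core database of peptide sequences and spectra. The following is a minimal sketch of that general idea, not the authors' implementation: the `PSM` record, the `build_core_peptides` function, the "higher score is better" convention, and the `min_projects` filter are all hypothetical names and assumptions introduced here for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """One peptide-spectrum match (hypothetical record layout)."""
    project: str      # project the raw file came from
    peptide: str      # identified peptide sequence
    spectrum_id: str  # identifier of the matched tandem MS spectrum
    score: float      # assumed convention: higher means a better match


def build_core_peptides(psms, min_projects=1):
    """Pool PSMs into a peptide -> representative PSM mapping.

    For each peptide, the best-scoring PSM is kept as its representative
    spectrum. `min_projects` is an illustrative knob (an assumption, not
    taken from the paper) for keeping only peptides seen in at least
    that many distinct projects.
    """
    projects_by_peptide = defaultdict(set)
    best_psm = {}
    for psm in psms:
        projects_by_peptide[psm.peptide].add(psm.project)
        current = best_psm.get(psm.peptide)
        if current is None or psm.score > current.score:
            best_psm[psm.peptide] = psm
    return {
        peptide: best_psm[peptide]
        for peptide, projects in projects_by_peptide.items()
        if len(projects) >= min_projects
    }


if __name__ == "__main__":
    example = [
        PSM("P1", "PEPTIDEK", "s1", 10.0),
        PSM("P2", "PEPTIDEK", "s2", 12.0),
        PSM("P1", "LONELYR", "s3", 5.0),
    ]
    for peptide, psm in build_core_peptides(example).items():
        print(peptide, psm.spectrum_id, psm.score)
```

In this sketch the sequence collection is the set of dictionary keys and the spectrum collection is the set of representative PSMs; a real pipeline would additionally need false discovery rate control and a proper spectral library format, which are outside the scope of this illustration.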