Biological Function Assignment across Taxonomic Levels in Mass-Spectrometry-Based Metaproteomics via a Modified Expectation Maximization Algorithm.

Author information

Alves Gelio, Ogurtsov Aleksey Y, Yu Yi-Kuo

Affiliation

Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States.

Publication information

J Proteome Res. 2025 Aug 1;24(8):3818-3832. doi: 10.1021/acs.jproteome.4c01125. Epub 2025 Jul 18.

DOI:10.1021/acs.jproteome.4c01125

PMID:40679470

Abstract

A major challenge in mass-spectrometry-based metaproteomics is accurately identifying and quantifying biological functions across the full taxonomic lineage of microorganisms. This issue stems from what we refer to as the "shared confidently identified peptide problem". To address this issue, most metaproteomics tools rely on the lowest common ancestor (LCA) algorithm to assign biological functions, which often leads to incomplete biological function assignments across the full taxonomic lineage of identified microorganisms. To overcome this limitation, we implemented an expectation-maximization (EM) algorithm, along with a biological function database, within the MiCId workflow. Using synthetic datasets, our study demonstrates that the enhanced MiCId workflow achieves better control over false discoveries and improved accuracy in microorganism identification and biomass estimation compared to Unipept and MetaGOmics. Additionally, the updated MiCId offers improved accuracy and better control of false discoveries in biological function identification compared to Unipept, along with reliable computation of function abundances across the full taxonomic lineage of identified microorganisms. Reanalyzing human oral and gut microbiome datasets using the enhanced MiCId workflow, we show that the results are consistent with those reported in the original publications, which were analyzed using the Galaxy-P platform with MEGAN5 and the MetaPro-IQ approach with Unipept, respectively.
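
To make the shared-peptide idea concrete, the following is a minimal sketch of a textbook EM for mixture proportions, not the modified EM implemented in MiCId, whose details the abstract does not give. The function name `em_taxon_weights` and the toy input (a list of candidate-taxon sets, one per identified peptide, each assumed non-empty) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def em_taxon_weights(peptide_taxa, n_iter=200, tol=1e-10):
    """Estimate taxon proportions from peptides that may map to several taxa.

    peptide_taxa: list of non-empty sets; each set holds the taxa whose
    proteomes contain that identified peptide (hypothetical toy input).
    """
    taxa = sorted(set().union(*peptide_taxa))
    weights = {t: 1.0 / len(taxa) for t in taxa}  # uniform starting guess
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: split each peptide among its candidate taxa in
        # proportion to the current weights.
        for candidates in peptide_taxa:
            total = sum(weights[t] for t in candidates)
            for t in candidates:
                counts[t] += weights[t] / total
        # M-step: new weights are the normalized expected counts.
        new = {t: counts[t] / len(peptide_taxa) for t in taxa}
        shift = max(abs(new[t] - weights[t]) for t in taxa)
        weights = new
        if shift < tol:
            break
    return weights


# Toy data: 3 peptides unique to taxon A, 1 unique to B, 2 shared by both.
peptides = [{"A"}] * 3 + [{"B"}] + [{"A", "B"}] * 2
print(em_taxon_weights(peptides))
```

In this toy case the update for taxon A reduces to w_A = (3 + 2 w_A) / 6, so the weights settle at 0.75 for A and 0.25 for B: the two shared peptides are apportioned according to the unique evidence. An LCA rule would instead place those two peptides at the common ancestor of A and B rather than at either taxon.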
