BMDD: A Probabilistic Framework for Accurate Imputation of Zero-inflated Microbiome Sequencing Data.

Authors

Zhou Huijuan, Chen Jun, Zhang Xianyang

Affiliations

Shanghai University of Finance and Economics, Shanghai, China.

Mayo Clinic, Rochester, Minnesota, USA.

Publication information

bioRxiv. 2025 May 12:2025.05.08.652808. doi: 10.1101/2025.05.08.652808.

DOI: 10.1101/2025.05.08.652808
PMID: 40462952
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12132411/
Abstract

Microbiome sequencing data are inherently sparse and compositional, with excessive zeros arising from biological absence or insufficient sampling. These zeros pose significant challenges for downstream analyses, particularly those that require log-transformation. We introduce BMDD (BiModal Dirichlet Distribution), a novel probabilistic modeling framework for accurate imputation of microbiome sequencing data. Unlike existing imputation approaches that assume unimodal abundance, BMDD captures the bimodal abundance distribution of the taxa via a mixture of Dirichlet priors. It uses variational inference and a scalable expectation-maximization algorithm for efficient imputation. Through simulations and real microbiome datasets, we demonstrate that BMDD outperforms competing methods in reconstructing true abundances and improves the performance of differential abundance analysis. Through multiple posterior samples, BMDD enables robust inference by accounting for uncertainty in zero imputation. Our method offers a principled and computationally efficient solution for analyzing high-dimensional, zero-inflated microbiome sequencing data and is broadly applicable in microbial biomarker discovery and host-microbiome interaction studies. BMDD is available at: https://github.com/zhouhj1994/BMDD.
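
To make the abstract's two central points concrete (zeros break log-based transforms, and multiple posterior draws carry imputation uncertainty forward), here is a minimal sketch. It is not the BMDD algorithm: it uses a plain conjugate Dirichlet-multinomial posterior with a single symmetric prior rather than the bimodal mixture of Dirichlet priors, and it involves no variational inference or EM. The counts, the prior value, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one composition from Dirichlet(alpha) via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def clr(composition):
    """Centered log-ratio transform; math.log raises ValueError on a zero entry."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def posterior_samples(counts, alpha, n_samples, seed=0):
    """Sample compositions from the conjugate posterior Dirichlet(alpha + counts)."""
    rng = random.Random(seed)
    posterior_alpha = [a + c for a, c in zip(alpha, counts)]
    return [sample_dirichlet(posterior_alpha, rng) for _ in range(n_samples)]


# Hypothetical read counts for five taxa in one sample; two taxa are unobserved.
counts = [120, 0, 35, 0, 845]
# Hypothetical symmetric prior (an assumption, not a fitted BMDD prior).
alpha = [0.5] * 5

# Naive proportions keep the zeros, so the log transform is undefined.
naive = [c / sum(counts) for c in counts]
try:
    clr(naive)
    naive_clr_failed = False
except ValueError:
    naive_clr_failed = True

# Posterior draws are strictly positive, so each one can be log-transformed.
samples = posterior_samples(counts, alpha, n_samples=100)
clr_samples = [clr(s) for s in samples]
```

A downstream analysis could then be run once per draw and the results combined, which is the general idea behind using multiple posterior samples; how BMDD itself combines them is described in the paper and the linked repository, not here.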

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a712/12132411/7eada75523d8/nihpp-2025.05.08.652808v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a712/12132411/abfee2c3675f/nihpp-2025.05.08.652808v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a712/12132411/71bac78f7682/nihpp-2025.05.08.652808v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a712/12132411/e424ddb8568a/nihpp-2025.05.08.652808v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a712/12132411/a9941ec7839e/nihpp-2025.05.08.652808v1-f0005.jpg
