
tascCODA: Bayesian Tree-Aggregated Analysis of Compositional Amplicon and Single-Cell Data.

Author information

Ostner Johannes, Carcy Salomé, Müller Christian L

Affiliations

Department of Statistics, Ludwig-Maximilians-Universität München, Munich, Germany.

Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany.

Publication information

Front Genet. 2021 Dec 7;12:766405. doi: 10.3389/fgene.2021.766405. eCollection 2021.

DOI:10.3389/fgene.2021.766405
PMID:34950190
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8689185/
Abstract

Accurate generative statistical modeling of count data is of critical relevance for the analysis of biological datasets from high-throughput sequencing technologies. Important instances include the modeling of microbiome compositions from amplicon sequencing surveys and the analysis of cell type compositions derived from single-cell RNA sequencing. Microbial and cell type abundance data share remarkably similar statistical features, including their inherent compositionality and a natural hierarchical ordering of the individual components from taxonomic or cell lineage tree information, respectively. To this end, we introduce a Bayesian model for tree-aggregated amplicon and single-cell compositional data analysis (tascCODA) that seamlessly integrates hierarchical information and experimental covariate data into the generative modeling of compositional count data. By combining latent parameters based on the tree structure with spike-and-slab Lasso penalization, tascCODA can determine covariate effects across different levels of the population hierarchy in a data-driven parsimonious way. In the context of differential abundance testing, we validate tascCODA's excellent performance on a comprehensive set of synthetic benchmark scenarios. Our analyses on human single-cell RNA-seq data from ulcerative colitis patients and amplicon data from patients with irritable bowel syndrome, respectively, identified aggregated cell type and taxon compositional changes that were more predictive and parsimonious than those proposed by other schemes. We posit that tascCODA constitutes a valuable addition to the growing statistical toolbox for generative modeling and analysis of compositional changes in microbial or cell population data.
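
The idea of "latent parameters based on the tree structure" can be illustrated with a small sketch. The code below is not the tascCODA package API; it is a toy example under stated assumptions: a hypothetical three-leaf cell-type tree (`PARENT`), one effect parameter per non-root node, and leaf-level effects obtained by summing the effects of each leaf's ancestors through a binary ancestor matrix. A softmax then maps leaf log-abundances to a composition. The function names, the tree, and the effect value are all invented for illustration, and the spike-and-slab Lasso prior that would encourage sparse node effects is not modeled here.

```python
import numpy as np

# Hypothetical toy tree (child -> parent); None marks the root.
PARENT = {
    "root": None,
    "immune": "root",
    "epithelial": "root",
    "T cell": "immune",
    "B cell": "immune",
    "enterocyte": "epithelial",
}


def ancestor_matrix(parent):
    """Return (leaves, nodes, A) where A[i, j] = 1 if node j is leaf i
    itself or one of its ancestors. The root is excluded from `nodes`."""
    nodes = [n for n in parent if parent[n] is not None]
    has_children = set(parent.values())
    leaves = [n for n in parent if n not in has_children]
    A = np.zeros((len(leaves), len(nodes)))
    for i, leaf in enumerate(leaves):
        node = leaf
        while parent[node] is not None:  # walk up until the root
            A[i, nodes.index(node)] = 1.0
            node = parent[node]
    return leaves, nodes, A


def leaf_effects(A, node_effects):
    """Aggregate node-level effects down to the leaves."""
    return A @ node_effects


def composition(base, effects):
    """Softmax of base log-abundances plus covariate effects."""
    z = base + effects
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()


leaves, nodes, A = ancestor_matrix(PARENT)

# A single non-zero effect on the internal node "immune" (assumed value).
node_effects = np.zeros(len(nodes))
node_effects[nodes.index("immune")] = 1.0

beta = leaf_effects(A, node_effects)
shifted = composition(np.zeros(len(leaves)), beta)
```

In this sketch one non-zero parameter on an internal node shifts every leaf below it, which is the sense in which an effect placed high in the hierarchy gives a more parsimonious description than separate effects on each leaf.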


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/68d3fb3da6c6/fgene-12-766405-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/437d8a4b8d85/fgene-12-766405-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/630dc7ec248a/fgene-12-766405-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/911105908d08/fgene-12-766405-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/2019e2dad404/fgene-12-766405-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/33421a5d99b5/fgene-12-766405-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/f5c833e2c722/fgene-12-766405-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0078/8689185/2406ac7aa95a/fgene-12-766405-g008.jpg


