
Score matching for differential abundance testing of compositional high-throughput sequencing data.

Authors

Ostner Johannes, Li Hongzhe, Müller Christian L

Affiliations

Computational Health Center, Helmholtz Munich, Neuherberg, Germany.

Institut für Statistik, Ludwig-Maximilians-Universität München, Munich, Germany.

Publication

bioRxiv. 2024 Dec 9:2024.12.05.627006. doi: 10.1101/2024.12.05.627006.

DOI: 10.1101/2024.12.05.627006
PMID: 39713439
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11661129/
Abstract

The class of a-b power interaction models, proposed by Yu et al. (2024), provides a general framework for modeling sparse compositional count data with pairwise feature interactions. This class includes many distributions as special cases and handles zero counts through power transformations, making it especially suitable for modern high-throughput sequencing data with excess zeros, including single-cell RNA-Seq and amplicon sequencing data. Here, we present an extension of this class of models that can include covariate information, allowing for accurate characterization of covariate dependencies in heterogeneous populations. Combining this model with a tailored differential abundance (DA) test leads to a novel DA testing scheme, cosmoDA, that can reduce false positive detection caused by correlated features. cosmoDA uses the generalized score matching estimation framework for power interaction models. Our benchmarks on simulated and real data show that cosmoDA can accurately estimate feature interactions in the presence of population heterogeneity and significantly reduces the false discovery rate when testing for differential abundance of correlated features. Finally, cosmoDA provides an explicit link to popular Box-Cox-type data transformations and allows the impact of zero replacement and power transformations on downstream differential abundance results to be assessed. cosmoDA is available at https://github.com/bio-datascience/cosmoDA.
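The abstract's point about zero handling can be illustrated with a generic Box-Cox-type power transformation. The snippet below is a minimal sketch, not the cosmoDA API: the function names `closure` and `box_cox`, the toy counts, and the exponent 0.5 are all assumptions chosen for illustration. It shows that a positive exponent keeps zero proportions finite, whereas the logarithm (the limit as the exponent goes to 0) does not.

```python
import numpy as np


def closure(counts):
    """Rescale non-negative counts to proportions that sum to 1 (last axis)."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=-1, keepdims=True)


def box_cox(x, a):
    """Box-Cox-type power transform: (x**a - 1) / a, with log(x) as the a = 0 limit."""
    x = np.asarray(x, dtype=float)
    if a == 0:
        with np.errstate(divide="ignore"):
            return np.log(x)  # -inf wherever x == 0
    return (x**a - 1.0) / a


# Hypothetical sample: three features, one of them unobserved (zero count).
sample = closure([2, 0, 6])          # proportions 0.25, 0.0, 0.75
transformed = box_cox(sample, 0.5)   # finite everywhere, including the zero
log_version = box_cox(sample, 0)     # the zero maps to -inf
```

With a positive exponent the zero entry maps to -1/a, so no pseudocount or zero replacement is needed before transforming. With the log transform the zero would have to be replaced first, which is the kind of preprocessing choice whose downstream impact the abstract says cosmoDA allows one to assess.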

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b47/11661129/c77705be0887/nihpp-2024.12.05.627006v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b47/11661129/242b5388b285/nihpp-2024.12.05.627006v1-f0008.jpg
