Counts: an outstanding challenge for log-ratio analysis of compositional data in the molecular biosciences.

Authors

Lovell David R, Chua Xin-Yi, McGrath Annette

Affiliations

Queensland University of Technology, Australia.

Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia.

Publication information

NAR Genom Bioinform. 2020 Jun 19;2(2):lqaa040. doi: 10.1093/nargab/lqaa040. eCollection 2020 Jun.

DOI:10.1093/nargab/lqaa040
PMID:33575593
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7671413/
Abstract

Thanks to sequencing technology, modern molecular bioscience datasets are often compositions of counts, e.g. counts of amplicons, mRNAs, etc. While there is growing appreciation that compositional data need special analysis and interpretation, less well understood is the discrete nature of these count compositions (or, as we call them, lattice compositions) and the impact this has on statistical analysis, particularly log-ratio analysis (LRA) of pairwise association. While LRA methods are scale-invariant, count compositional data are not; consequently, the conclusions we draw from LRA of lattice compositions depend on the scale of counts involved. We know that additive variation affects the relative abundance of small counts more than large counts; here we show that additive (quantization) variation comes from the discrete nature of count data itself, as well as (biological) variation in the system under study and (technical) variation from measurement and analysis processes. Variation due to quantization is inevitable, but its impact on conclusions depends on the underlying scale and distribution of counts. We illustrate the different distributions of real molecular bioscience data from different experimental settings to show why it is vital to understand the distributional characteristics of count data before applying and drawing conclusions from compositional data analysis methods.
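The abstract's central point, that log-ratios are scale-invariant while count data are not, can be illustrated with a small numerical sketch. This is not the authors' analysis: the function name `log_ratio_shift` and the example counts are hypothetical, chosen only to show how a one-count (quantization-sized) change moves a log-ratio at low versus high count scales.

```python
import math


def log_ratio_shift(count_a: int, count_b: int, delta: int = 1) -> float:
    """Absolute change in log(count_a / count_b) when count_a changes by delta.

    Hypothetical helper for illustration; counts must be positive.
    """
    before = math.log(count_a / count_b)
    after = math.log((count_a + delta) / count_b)
    return abs(after - before)


# The same 1:2 ratio observed at two very different count scales.
low_scale = (2, 4)
high_scale = (2000, 4000)

# The log-ratio itself is identical at both scales (scale invariance).
lr_low = math.log(low_scale[0] / low_scale[1])
lr_high = math.log(high_scale[0] / high_scale[1])

# A single extra count perturbs the low-scale log-ratio far more.
shift_low = log_ratio_shift(*low_scale)
shift_high = log_ratio_shift(*high_scale)

print(f"log-ratio (low scale):  {lr_low:.4f}")
print(f"log-ratio (high scale): {lr_high:.4f}")
print(f"shift from +1 count (low scale):  {shift_low:.4f}")
print(f"shift from +1 count (high scale): {shift_high:.6f}")
```

At the low scale, one additional count changes the log-ratio by log(3/2), whereas at the high scale the same additive change is almost invisible. This is the sense in which conclusions from log-ratio analysis of count compositions depend on the scale of the counts involved.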

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/0f517261533e/lqaa040fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/0fc774151052/lqaa040fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/c8b7f1f79e4e/lqaa040fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/35890eda7131/lqaa040fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/a5652b3e81f2/lqaa040fig5.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/bd96fa29b88f/lqaa040fig6.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/b7bbb01db935/lqaa040fig7.jpg
Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea4a/7671413/f0c2a062c1cd/lqaa040fig8.jpg
