Integrative analysis of microbial 16S gene and shotgun metagenomic sequencing data improves statistical efficiency.

Author information

Yue Ye, Read Timothy D, Fedirko Veronika, Satten Glen A, Hu Yi-Juan

Affiliations

Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, USA.

Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, Atlanta, GA, 30322, USA.

Publication information

Res Sq. 2023 Oct 3:rs.3.rs-3376801. doi: 10.21203/rs.3.rs-3376801/v1.

Abstract

BACKGROUND

The most widely used technologies for profiling microbial communities are 16S marker-gene sequencing and shotgun metagenomic sequencing. Many microbiome studies have performed both sequencing experiments on the same cohort of samples. The two sequencing datasets often reveal consistent patterns of microbial signatures, which suggests that an integrative analysis could improve the power of tests for these signatures. However, differential experimental biases, partially overlapping samples, and differential library sizes pose substantial challenges when combining the two datasets. Currently, researchers either discard one dataset entirely or use different datasets for different objectives.

METHODS

In this article, we introduce the first method of this kind, named Com-2seq, which combines the two sequencing datasets to test differential abundance at the genus and community levels while overcoming these difficulties. The new method is based on our LOCOM model (Hu et al., 2022), which employs logistic regression for testing taxon differential abundance while remaining robust to experimental bias. To benchmark the performance of Com-2seq, we introduce two alternative approaches: applying LOCOM to pooled taxa count data, and combining the LOCOM p-values obtained from analyzing each dataset separately.
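The data handling behind the two benchmark approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`pool_counts`, `fisher_combine`, `combine_p_values`) and the dictionary layout are hypothetical, the LOCOM fit itself is not shown, and Fisher's method is an assumed choice because the abstract does not say which p-value combination rule is used. Fisher's method also assumes independent p-values, which may not hold when both datasets come from overlapping samples.

```python
import math


def pool_counts(counts_16s, counts_shotgun):
    """Sum genus-level read counts from both tables, sample by sample.

    Each table maps sample ID -> {genus: count}. A sample or genus that is
    missing from one table simply contributes nothing from that table.
    """
    pooled = {}
    for table in (counts_16s, counts_shotgun):
        for sample, genera in table.items():
            row = pooled.setdefault(sample, {})
            for genus, count in genera.items():
                row[genus] = row.get(genus, 0) + count
    return pooled


def fisher_combine(p_values):
    """Combine p-values in (0, 1] with Fisher's method (assumed choice)."""
    k = len(p_values)
    # half_stat is X/2, where X = -2 * sum(log p) is chi-square with 2k df.
    half_stat = -sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom.
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail


def combine_p_values(p_16s, p_shotgun):
    """Combine per-genus p-values; a genus tested once keeps its p-value."""
    combined = {}
    for genus in sorted(set(p_16s) | set(p_shotgun)):
        ps = [d[genus] for d in (p_16s, p_shotgun) if genus in d]
        combined[genus] = fisher_combine(ps)
    return combined
```

In this sketch the pooled table would then be passed to a single differential abundance analysis, whereas `combine_p_values` would be applied after two separate analyses. Neither step accounts for the differential experimental biases that Com-2seq is designed to handle.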

RESULTS

Our simulation studies indicate that Com-2seq substantially improves statistical efficiency over analysis of either dataset alone and outperforms the two benchmark approaches. An application of Com-2seq to two real microbiome studies uncovered scientifically plausible findings that would have been missed by analyzing the individual datasets.

CONCLUSIONS

Com-2seq performs integrative analysis of 16S and metagenomic sequencing data, which improves statistical efficiency and has the potential to accelerate the search for microbial communities and taxa that are involved in human health and disease.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42ac/10602108/a2fad60504de/nihpp-rs3376801v1-f0001.jpg
