An informative approach on differential abundance analysis for time-course metagenomic sequencing data.

Author information

Luo Dan, Ziebell Sara, An Lingling

Affiliations

Department of Epidemiology and Biostatistics, College of Public Health.

Interdisciplinary Program in Statistics.

Publication information

Bioinformatics. 2017 May 1;33(9):1286-1292. doi: 10.1093/bioinformatics/btw828.

Abstract

MOTIVATION

The advent of high-throughput next-generation sequencing has greatly advanced metagenomics, making previously unattainable information about microbial communities accessible. Detecting differentially abundant features (e.g. species or genes) plays a critical role in revealing the contributors (i.e. pathogens) to the biological or medical status of microbial samples. However, currently available statistical methods lack power to detect features that are differentially abundant between biological or medical conditions, particularly for time-series metagenomic sequencing data. We propose a novel procedure, metaDprof, built upon a spline-based method that assumes heterogeneous error, to detect differentially abundant features in metagenomic samples by comparing biological/medical conditions across time. It has two stages: (i) global detection of features and (ii) detection of the time intervals in which the significant features differ. The detection procedures in both stages are statistically grounded.
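To make the two-stage idea concrete, the following is a minimal sketch and not the metaDprof implementation (the authors distribute that as R code). It assumes log-transformed, normalized abundances for one feature measured at shared time points, fits a cubic regression spline per group, and substitutes a simple subject-level permutation test for the paper's test statistics. It does not model heterogeneous error. All function names, the knot choice and the permutation scheme are illustrative assumptions.

```python
import numpy as np

def spline_basis(t, knots):
    """Cubic truncated-power basis: 1, t, t^2, t^3 and (t - k)_+^3 per knot."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t, t**2, t**3]
    cols += [np.clip(t - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_curve(t, y, knots, grid):
    """Least-squares spline fit of y on t, evaluated on grid."""
    coef, *_ = np.linalg.lstsq(spline_basis(t, knots), y, rcond=None)
    return spline_basis(grid, knots) @ coef

def runs(mask, grid):
    """Turn a boolean mask over grid into a list of (start, end) intervals."""
    intervals, start = [], None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i
        if not flag and start is not None:
            intervals.append((float(grid[start]), float(grid[i - 1])))
            start = None
    if start is not None:
        intervals.append((float(grid[start]), float(grid[-1])))
    return intervals

def two_stage(t, Y, group, knots, n_perm=200, alpha=0.05, seed=0):
    """Y has shape (n_subjects, n_times); group holds 0/1 labels per subject.

    Stage 1 (assumed): permutation test on the mean squared difference
    between the two fitted group curves.
    Stage 2 (assumed): intervals where |difference| exceeds the 1 - alpha
    quantile of the permutation maxima (a simultaneous threshold).
    """
    t = np.asarray(t, dtype=float)
    group = np.asarray(group)
    rng = np.random.default_rng(seed)
    grid = np.linspace(t.min(), t.max(), 101)

    def diff(labels):
        curves = []
        for label in (1, 0):
            rows = Y[labels == label]
            tt = np.tile(t, rows.shape[0])  # matches row-major ravel below
            curves.append(fit_curve(tt, rows.ravel(), knots, grid))
        return curves[0] - curves[1]

    obs = diff(group)
    perm = np.array([diff(rng.permutation(group)) for _ in range(n_perm)])

    # Stage 1: global detection for this feature.
    obs_stat = np.mean(obs**2)
    perm_stat = np.mean(perm**2, axis=1)
    p_value = (1 + np.sum(perm_stat >= obs_stat)) / (1 + n_perm)
    if p_value > alpha:
        return p_value, []

    # Stage 2: time-interval detection for a globally significant feature.
    threshold = np.quantile(np.max(np.abs(perm), axis=1), 1 - alpha)
    return p_value, runs(np.abs(obs) > threshold, grid)

if __name__ == "__main__":
    # Simulated example: group 1 is elevated for 4 <= t <= 7.
    rng = np.random.default_rng(1)
    t = np.arange(0, 10, 0.5)
    group = np.array([0] * 10 + [1] * 10)
    Y = rng.normal(0, 0.3, size=(20, t.size))
    Y[group == 1] += 3.0 * ((t >= 4) & (t <= 7))
    print(two_stage(t, Y, group, knots=[2.5, 5.0, 7.5]))
```

In a real analysis this would be run per feature with a multiple-testing correction across features. Because the spline smooths the group curves, the reported interval edges are approximate.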

RESULTS

In comprehensive simulation studies, metaDprof shows the best performance among the methods compared. It accurately detects features related to the biological condition or disease status of the samples, and it also accurately identifies the start and end time points of the differences. Applied to a real metagenomic dataset, the method offers an interesting angle on the relationship between the mouse gut microbiota and diet type.

AVAILABILITY AND IMPLEMENTATION

R code and an example dataset are available at https://cals.arizona.edu/~anling/sbg/software.htm.

CONTACT

anling@email.arizona.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
