Institute of Information Science, Academia Sinica, Taipei, 115, Taiwan.
Bioinformatics. 2011 Dec 15;27(24):3341-7. doi: 10.1093/bioinformatics/btr583. Epub 2011 Oct 20.
Metagenomics involves sampling and studying the genetic material in microbial communities. Several statistical methods have been proposed for comparative analysis of microbial community compositions. Most of these methods are based on the abundances of taxonomic units or functional groups estimated from metagenomic samples. However, such estimated abundances might deviate from the true abundances in habitats because of sampling biases and other systematic artifacts in metagenomic data processing.
We developed the MetaRank scheme to convert abundances into ranks. MetaRank employs a series of statistical hypothesis tests to compare abundances within a microbial community and determine their ranks. We applied MetaRank to synthetic samples and real metagenomes. The results confirm that MetaRank can reduce the effects of sampling biases and clarify the characteristics of metagenomes in comparative studies of microbial communities. Therefore, MetaRank provides a useful rank-based approach to analyzing microbiomes.
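The abstract does not specify which hypothesis tests MetaRank uses or how ties are resolved, so the following is only a minimal sketch of the general idea of test-based ranking, not the published algorithm. It assumes read counts per taxon from a single sample, compares two counts with a normal-approximation test of equal abundance, and gives a taxon a new, higher rank only when its count differs significantly from the count that opened the current rank group. The function names, the choice of test, and the `alpha` threshold are all illustrative assumptions.

```python
import math


def differs(a, b, alpha=0.05):
    """Two-sided test of equal abundance for two counts from one sample.

    Assumption (not from the paper): conditional on a + b, the count a is
    Binomial(a + b, 0.5) under the null, approximated here by a normal.
    """
    if a + b == 0:
        return False
    z = (a - b) / math.sqrt(a + b)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_value < alpha


def abundance_ranks(counts, alpha=0.05):
    """Map each taxon to a rank; taxa that are not significantly
    different from the first member of a rank group share its rank.
    Rank 1 is the least abundant group (an arbitrary convention)."""
    ordered = sorted(counts.items(), key=lambda kv: kv[1])
    ranks = {}
    rank = 0
    anchor = None  # count of the taxon that opened the current rank group
    for name, n in ordered:
        if anchor is None or differs(n, anchor, alpha):
            rank += 1
            anchor = n
        ranks[name] = rank
    return ranks


if __name__ == "__main__":
    sample = {"A": 1000, "B": 980, "C": 500, "D": 10, "E": 12}
    print(abundance_ranks(sample))
```

In this hypothetical sample, taxa A and B (1000 and 980 reads) fall into the same rank, as do D and E, because their counts are statistically indistinguishable. Small fluctuations in estimated abundance therefore do not change the ranking, which is the kind of robustness to sampling bias that a rank-based representation is meant to provide.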
Supplementary data are available at Bioinformatics online.