Metagenomic signatures of 86 microbial and viral metagenomes.

Affiliation

Department of Biology, San Diego State University, 5500 Campanile Dr., San Diego, CA 92182, USA.

Publication information

Environ Microbiol. 2009 Jul;11(7):1752-66. doi: 10.1111/j.1462-2920.2009.01901.x. Epub 2009 Mar 18.

Abstract

Previous studies have shown that dinucleotide abundances capture the majority of variation in genome signatures and are useful for quantifying lateral gene transfer and building molecular phylogenies. Metagenomes contain a mixture of individual genomes, and might be expected to lack compositional signatures. In many metagenomic data sets the majority of sequences have no significant similarities to known sequences and are effectively excluded from subsequent analyses. To circumvent this limitation, di-, tri- and tetranucleotide abundances of 86 microbial and viral metagenomes consisting of short pyrosequencing reads were analysed to provide a method which includes all sequences that can be used in combination with other analysis to increase our knowledge about microbial and viral communities. Both principal component analysis and hierarchical clustering showed definitive groupings of metagenomes drawn from similar environments. Together these analyses showed that dinucleotide composition, as opposed to tri- and tetranucleotides, defines a metagenomic signature which can explain up to 80% of the variance between biomes, which is comparable to that obtained by functional genomics. Metagenomes with anomalous content were also identified using dinucleotide abundances. Subsequent analyses determined that these metagenomes were contaminated with exogenous DNA, suggesting that this approach is a useful metric for quality control. The predictive strength of the dinucleotide composition also opens the possibility of assigning ecological classifications to unknown fragments. Environmental selection may be responsible for this dinucleotide signature through direct selection of specific compositional signals; however, simulations suggest that the environment may select indirectly by promoting the increased abundance of a few dominant taxa.
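As a rough illustration of what a dinucleotide abundance profile is, the sketch below pools overlapping dinucleotide counts across a set of short reads and converts them to relative frequencies. It is a minimal example under stated assumptions, not the authors' pipeline: the abstract does not specify the normalization used, and the function names (`dinucleotide_frequencies`, `euclidean_distance`) and the toy reads are hypothetical. Profiles like these, one per metagenome, are the kind of input that principal component analysis or hierarchical clustering could then operate on.

```python
from collections import Counter
from itertools import product

# The 16 possible dinucleotides over the unambiguous DNA alphabet.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
VALID_BASES = set("ACGT")


def dinucleotide_frequencies(reads):
    """Relative frequencies of the 16 dinucleotides, pooled over all reads.

    Assumption: overlapping windows are counted, and any window containing
    an ambiguous base (for example N) is skipped.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - 1):
            pair = seq[i:i + 2]
            if pair[0] in VALID_BASES and pair[1] in VALID_BASES:
                counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return {d: 0.0 for d in DINUCLEOTIDES}
    return {d: counts[d] / total for d in DINUCLEOTIDES}


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two 16-dimensional dinucleotide profiles."""
    return sum(
        (profile_a[d] - profile_b[d]) ** 2 for d in DINUCLEOTIDES
    ) ** 0.5


if __name__ == "__main__":
    # Toy reads standing in for two metagenomes (hypothetical data).
    sample_a = ["ACGTACGT", "ACGTTGCA"]
    sample_b = ["AAAATTTT", "AATTAATT"]
    profile_a = dinucleotide_frequencies(sample_a)
    profile_b = dinucleotide_frequencies(sample_b)
    print(euclidean_distance(profile_a, profile_b))
```

Because every read contributes to the profile regardless of whether it matches a known sequence, this kind of summary includes the reads that similarity-based analyses would exclude.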

