Evaluating metagenomic assembly approaches for biome-specific gene catalogues.

Affiliation

Department of Gene Technology, Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden.

Publication information

Microbiome. 2022 May 6;10(1):72. doi: 10.1186/s40168-022-01259-2.

Abstract

BACKGROUND

For many environments, biome-specific microbial gene catalogues are being recovered using shotgun metagenomics followed by assembly and gene calling on the assembled contigs. The assembly is typically conducted either by assembling each sample individually or by co-assembling reads from all the samples. Co-assembly can potentially recover genes whose abundance is too low for them to be assembled from individual samples. On the other hand, combining samples increases the risk of mixing data from closely related strains, which can hamper the assembly process. In this respect, assembly on individual samples followed by clustering of (near) identical genes is preferable. Both approaches thus have potential advantages and drawbacks, but it remains to be evaluated which assembly strategy is most effective. Here, we have evaluated three assembly strategies for generating gene catalogues from metagenomes using a dataset of 124 samples from the Baltic Sea: (1) assembly on individual samples followed by clustering of the resulting genes, (2) co-assembly on all samples, and (3) mix assembly, combining individual and co-assembly.
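The idea behind strategy (3) can be illustrated with a toy sketch: pool the genes called on individual assemblies with those called on the co-assembly, then cluster (near) identical sequences so that each cluster contributes one representative to the nonredundant set. The abstract does not state which clustering tool or parameters were used, so everything below is an assumption for illustration: the function names, the 0.95 threshold, the longest-first greedy scheme, and the simplified ungapped identity measure are all hypothetical and would be replaced by a dedicated sequence-clustering tool in practice.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Toy ungapped identity: matching positions over the shorter length.

    Assumption for illustration only; real pipelines use alignment-based identity.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def cluster_genes(genes: Dict[str, str], threshold: float = 0.95) -> Dict[str, List[str]]:
    """Greedy incremental clustering, longest sequences first.

    Returns a mapping from representative gene id to the ids of its cluster members.
    """
    clusters: Dict[str, List[str]] = {}
    # Longest first, so the most complete gene becomes the representative.
    for gene_id in sorted(genes, key=lambda g: (-len(genes[g]), g)):
        seq = genes[gene_id]
        for rep_id in clusters:
            if identity(seq, genes[rep_id]) >= threshold:
                clusters[rep_id].append(gene_id)
                break
        else:
            # No representative is similar enough: start a new cluster.
            clusters[gene_id] = [gene_id]
    return clusters


def mix_assembly_catalogue(
    individual: Dict[str, str],
    coassembly: Dict[str, str],
    threshold: float = 0.95,
) -> Dict[str, List[str]]:
    """Pool genes from both assembly routes and reduce them to one nonredundant set."""
    pooled = {**individual, **coassembly}
    return cluster_genes(pooled, threshold)
```

In this sketch, a gene that appears truncated in an individual assembly but longer in the co-assembly ends up in a cluster represented by the longer copy, which is one plausible way a combined catalogue could retain more complete genes than either route alone.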

RESULTS

The mix-assembly approach yielded a more extensive nonredundant gene set than the other approaches, with more genes predicted to be complete and more genes that could be functionally annotated. The mix assembly consists of 67 million genes (Baltic Sea gene set, BAGS) that have been functionally and taxonomically annotated. The majority of the BAGS genes are dissimilar (< 95% amino acid identity) to the Tara Oceans gene dataset, and hence, BAGS represents a valuable resource for brackish water research.
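The dissimilarity criterion above (< 95% amino acid identity to a reference dataset) amounts to a simple proportion once each gene has a best-hit identity against the reference. A minimal sketch follows, assuming a hypothetical mapping from gene id to best-hit percent identity, with 0.0 standing in for genes that have no hit at all; the function name and input layout are illustrative and not taken from the paper.

```python
from typing import Dict


def fraction_dissimilar(best_hit_identity: Dict[str, float], threshold: float = 95.0) -> float:
    """Fraction of genes whose best reference hit falls below the identity threshold.

    best_hit_identity maps gene id -> percent amino acid identity of its best hit
    (hypothetical input; genes without any hit are assumed to be recorded as 0.0).
    """
    if not best_hit_identity:
        return 0.0
    dissimilar = sum(1 for pid in best_hit_identity.values() if pid < threshold)
    return dissimilar / len(best_hit_identity)
```

A "majority" in the sense used above would correspond to this function returning a value greater than 0.5 for the catalogue.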

CONCLUSION

The mix-assembly approach is a feasible way to increase the information obtained from metagenomic samples. Video abstract.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d0/9074274/682bfb4ef6cf/40168_2022_1259_Fig1_HTML.jpg
