Strain-resolved de-novo metagenomic assembly of viral genomes and microbial 16S rRNAs.

Author information

Jochheim Annika, Jochheim Florian A, Kolodyazhnaya Alexandra, Morice Étienne, Steinegger Martin, Söding Johannes

Affiliations

Quantitative and Computational Biology, Max-Planck Institute for Multidisciplinary Sciences, Göttingen, Germany.

International Max-Planck Research School for Genome Sciences, University of Göttingen, Göttingen, Germany.

Publication information

Microbiome. 2024 Oct 1;12(1):187. doi: 10.1186/s40168-024-01904-y.

Abstract

BACKGROUND

Metagenomics is a powerful approach to study environmental and human-associated microbial communities and, in particular, the role of viruses in shaping them. Viral genomes are challenging to assemble from metagenomic samples because high mutation rates make them genomically diverse. In standard de Bruijn graph assemblers, this diversity leads to complex k-mer assembly graphs with a plethora of loops and bulges that are challenging to resolve into strains or haplotypes, because variants more than the k-mer size apart cannot be phased. In contrast, overlap assemblers can phase variants as long as they are covered by a single read.
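The phasing limitation described above can be illustrated with a small sketch. It is not PenguiN's algorithm; the sequences, the names strain1/strain2/recomb1/recomb2, and the choice of k = 5 are made up for illustration. Two toy strains differ at two sites that lie more than k bases apart. Their combined k-mer set is identical to that of the two recombinant haplotypes, so a k-mer graph alone cannot tell which variants belong together, whereas a single read spanning both sites can.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_union(seqs, k):
    """Return the combined k-mer set of several sequences."""
    out = set()
    for s in seqs:
        out |= kmers(s, k)
    return out


# Hypothetical toy genome: shared flanks and a shared middle segment,
# with one variant site on each side of the middle segment.
left, mid, right = "ACGTTGCA", "GGATCCTAGGCT", "TTGACCGA"
strain1 = left + "A" + mid + "C" + right   # true haplotype 1
strain2 = left + "G" + mid + "T" + right   # true haplotype 2
recomb1 = left + "A" + mid + "T" + right   # wrong phasing
recomb2 = left + "G" + mid + "C" + right   # wrong phasing

# The variant sites are 13 bases apart, so no 5-mer covers both.
k = 5
same_kmers = kmer_union([strain1, strain2], k) == kmer_union([recomb1, recomb2], k)

# A read from strain1 covering both variant sites (positions 8 and 21).
start, end = 6, 24
read = strain1[start:end]
candidates = {"strain1": strain1, "strain2": strain2,
              "recomb1": recomb1, "recomb2": recomb2}
# Compare the read against each candidate at the same coordinates.
consistent = [name for name, seq in candidates.items() if seq[start:end] == read]

print(same_kmers)   # k-mer content cannot separate true from recombinant strains
print(consistent)   # the spanning read is compatible with one haplotype only
```

In this toy case the k-mer sets of the true and recombinant strain pairs coincide, while the spanning read matches only strain1, which is the intuition behind phasing variants by read overlaps.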

RESULTS

Here, we present PenguiN, software for strain-resolved assembly of viral DNA and RNA genomes and bacterial 16S rRNA from shotgun metagenomics. Its exhaustive detection of all read overlaps in linear time, combined with a Bayesian model to select strain-resolved extensions, allows it to assemble longer viral contigs, less fragmented genomes, and more strains than existing assembly tools, on both real and simulated datasets. We show a 3- to 40-fold increase in complete viral genomes and a 6-fold increase in bacterial 16S rRNA genes.

CONCLUSION

PenguiN is the first overlap-based assembler for viral genome and 16S rRNA assembly from large and complex metagenomic datasets, which we hope will facilitate studying the key roles of viruses in microbial communities. Video Abstract.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c5/11443906/acc5758864e3/40168_2024_1904_Fig1_HTML.jpg
