Improved metagenome binning and assembly using deep variational autoencoders.

Author information

Department of Health Technology, Technical University of Denmark, Lyngby, Denmark.

Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.

Publication information

Nat Biotechnol. 2021 May;39(5):555-560. doi: 10.1038/s41587-020-00777-4. Epub 2021 Jan 4.

Abstract

Despite recent advances in metagenomic binning, reconstruction of microbial species from metagenomics data remains challenging. Here we develop variational autoencoders for metagenomic binning (VAMB), a program that uses deep variational autoencoders to encode sequence coabundance and k-mer distribution information before clustering. We show that a variational autoencoder is able to integrate these two distinct data types without any previous knowledge of the datasets. VAMB outperforms existing state-of-the-art binners, reconstructing 29-98% and 45% more near-complete (NC) genomes on simulated and real data, respectively. Furthermore, VAMB is able to separate closely related strains up to 99.5% average nucleotide identity (ANI), and reconstructed 255 and 91 NC Bacteroides vulgatus and Bacteroides dorei sample-specific genomes as two distinct clusters from a dataset of 1,000 human gut microbiome samples. We use 2,606 NC bins from this dataset to show that species of the human gut microbiome have different geographical distribution patterns. VAMB can be run on standard hardware and is freely available at https://github.com/RasmussenLab/vamb .
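
The two data types named in the abstract, k-mer distribution and sequence coabundance, can be illustrated with a small sketch. This is not VAMB's actual code: the function names, the choice of k=4, the sum-to-one normalization, and the example contig and depths are all assumptions made for illustration. It shows only how a contig's k-mer frequencies and its per-sample abundances might be joined into one feature vector that an encoder could consume.

```python
from itertools import product


def kmer_frequencies(seq, k=4):
    """Relative frequency of every DNA k-mer in seq, in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in kmers]


def normalize_abundance(depths):
    """Scale per-sample read depths so they sum to 1 (assumed normalization)."""
    total = sum(depths)
    return [d / total if total else 0.0 for d in depths]


def contig_features(seq, depths, k=4):
    """Concatenate k-mer frequencies and normalized abundances for one contig."""
    return kmer_frequencies(seq, k) + normalize_abundance(depths)


# Hypothetical contig and hypothetical mean depths in three samples.
features = contig_features("ACGTACGTTTGACCA", [12.0, 0.5, 7.5])
print(len(features))  # 4**4 k-mer entries plus 3 abundance entries
```

Contigs from the same genome are expected to have similar k-mer profiles and to rise and fall together across samples, which is why a joint representation of both signals is useful before clustering.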

