Metagenomic binning with assembly graph embeddings.

Author affiliations

Department of Computer Science, Aalborg University, 9000 Aalborg, Denmark.

Center for Microbial Communities, Department of Chemistry and Bioscience, Aalborg University, 9000 Aalborg, Denmark.

Publication information

Bioinformatics. 2022 Sep 30;38(19):4481-4487. doi: 10.1093/bioinformatics/btac557.

Abstract

MOTIVATION

Despite recent advancements in sequencing technologies and assembly methods, obtaining high-quality microbial genomes from metagenomic samples is still not a trivial task. Current metagenomic binners do not take full advantage of assembly graphs and are not optimized for long-read assemblies. Deep graph learning algorithms have been proposed in other fields to deal with complex graph data structures. The graph structure generated during the assembly process could be integrated with contig features to obtain better bins with deep learning.

RESULTS

We propose GraphMB, which uses graph neural networks to incorporate the assembly graph into the binning process. We test GraphMB on long-read datasets of different complexities and compare its performance with that of other binners in terms of the number of high-quality (HQ) genome bins obtained. With our approach, we obtained unique bins on all real datasets and more bins on most datasets. In particular, we obtained on average 17.5% more HQ bins than state-of-the-art binners, and 13.7% more when aggregating the results of our binner with the others. These results indicate that a deep learning model can integrate contig-specific and graph-structure information to improve metagenomic binning.
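The idea of combining contig-specific features with assembly-graph structure can be illustrated with a minimal sketch. This is not GraphMB's architecture: it is a single round of unweighted mean aggregation over neighbours, the simplest form of the message passing that graph neural networks build on. The contig names, the two features (assumed here to be GC fraction and coverage) and the function name `aggregate_neighbors` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def aggregate_neighbors(
    features: Dict[str, List[float]],
    edges: List[Tuple[str, str]],
) -> Dict[str, List[float]]:
    """Average each contig's features with those of its graph neighbours.

    Illustrative only: one untrained mean-aggregation step, not the
    trained model described in the abstract.
    """
    # Each contig is its own neighbour, so isolated contigs keep their features.
    neighbors = {name: {name} for name in features}
    for a, b in edges:
        # Assembly-graph links are treated as undirected here (an assumption).
        neighbors[a].add(b)
        neighbors[b].add(a)

    aggregated = {}
    for name, nbrs in neighbors.items():
        dim = len(features[name])
        aggregated[name] = [
            sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(dim)
        ]
    return aggregated


# Toy data (hypothetical): [GC fraction, coverage] per contig.
features = {
    "contig_1": [0.40, 10.0],
    "contig_2": [0.60, 20.0],
    "contig_3": [0.50, 90.0],
}
# contig_1 and contig_2 are linked in the toy assembly graph; contig_3 is isolated.
edges = [("contig_1", "contig_2")]

embeddings = aggregate_neighbors(features, edges)
```

In this toy case the two linked contigs end up with identical aggregated vectors, while the isolated contig is unchanged, which is the intuition behind using graph links as evidence that contigs belong in the same bin. A learned model would replace the plain average with trained transformations.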

AVAILABILITY AND IMPLEMENTATION

GraphMB is available from https://github.com/MicrobialDarkMatter/GraphMB.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44d4/9525014/a0ed69f9b09b/btac557f1.jpg
