Metagenomic binning with assembly graph embeddings.

Affiliations

Department of Computer Science, Aalborg University, 9000 Aalborg, Denmark.

Center for Microbial Communities, Department of Chemistry and Bioscience, Aalborg University, 9000 Aalborg, Denmark.

Publication information

Bioinformatics. 2022 Sep 30;38(19):4481-4487. doi: 10.1093/bioinformatics/btac557.

DOI: 10.1093/bioinformatics/btac557
PMID: 35972375
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9525014/
Abstract

MOTIVATION

Despite recent advancements in sequencing technologies and assembly methods, obtaining high-quality microbial genomes from metagenomic samples is still not a trivial task. Current metagenomic binners do not take full advantage of assembly graphs and are not optimized for long-read assemblies. Deep graph learning algorithms have been proposed in other fields to deal with complex graph data structures. The graph structure generated during the assembly process could be integrated with contig features to obtain better bins with deep learning.

RESULTS

We propose GraphMB, which uses graph neural networks to incorporate the assembly graph into the binning process. We test GraphMB on long-read datasets of different complexities, and compare the performance with other binners in terms of the number of High Quality (HQ) genome bins obtained. With our approach, we were able to obtain unique bins on all real datasets, and obtain more bins on most datasets. In particular, we obtained on average 17.5% more HQ bins when compared with state-of-the-art binners and 13.7% when aggregating the results of our binner with the others. These results indicate that a deep learning model can integrate contig-specific and graph-structure information to improve metagenomic binning.

AVAILABILITY AND IMPLEMENTATION

GraphMB is available from https://github.com/MicrobialDarkMatter/GraphMB.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
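
The abstract describes combining contig-specific features with the structure of the assembly graph. As a rough illustration of that general idea only, the sketch below runs one round of untrained neighbor averaging on a toy graph, so that contigs linked in the graph end up with more similar vectors. It is not GraphMB's model, which trains a graph neural network; the function name `aggregate_neighbors`, the contig names, the two-dimensional features, and the edges are all hypothetical.

```python
from math import dist


def aggregate_neighbors(features, edges):
    """Average each node's feature vector with those of its graph neighbors.

    features: dict mapping node name -> list of floats (all the same length)
    edges:    list of (node_a, node_b) pairs, treated as undirected
    """
    neighbors = {node: [] for node in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    smoothed = {}
    for node, vec in features.items():
        # The node's own vector is included so isolated nodes stay unchanged.
        group = [vec] + [features[n] for n in neighbors[node]]
        smoothed[node] = [sum(col) / len(group) for col in zip(*group)]
    return smoothed


# Hypothetical toy data: per-contig features such as GC content and
# scaled coverage, plus assembly-graph links between contigs.
features = {
    "c1": [0.40, 1.0],
    "c2": [0.50, 1.2],
    "c3": [0.70, 3.0],
    "c4": [0.60, 2.8],
}
edges = [("c1", "c2"), ("c3", "c4")]

smoothed = aggregate_neighbors(features, edges)

# Linked contigs are closer after aggregation than before.
print(dist(features["c1"], features["c2"]), dist(smoothed["c1"], smoothed["c2"]))
```

A clustering step applied to the smoothed vectors would then tend to place linked contigs in the same bin. A learned model replaces the fixed average with trained weights.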

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44d4/9525014/a0ed69f9b09b/btac557f1.jpg

Similar articles

1
Metagenomic binning with assembly graph embeddings.
Bioinformatics. 2022 Sep 30;38(19):4481-4487. doi: 10.1093/bioinformatics/btac557.
2
Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets.
BMC Bioinformatics. 2020 Jul 28;21(1):334. doi: 10.1186/s12859-020-03667-3.
3
Accurate Binning of Metagenomic Contigs Using Composition, Coverage, and Assembly Graphs.
J Comput Biol. 2022 Dec;29(12):1357-1376. doi: 10.1089/cmb.2022.0262. Epub 2022 Nov 11.
4
SemiBin2: self-supervised contrastive learning leads to better MAGs for short- and long-read sequencing.
Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i21-i29. doi: 10.1093/bioinformatics/btad209.
5
GraphBin: refined binning of metagenomic contigs using assembly graphs.
Bioinformatics. 2020 Jun 1;36(11):3307-3313. doi: 10.1093/bioinformatics/btaa180.
6
Binning_refiner: improving genome bins through the combination of different binning programs.
Bioinformatics. 2017 Jun 15;33(12):1873-1875. doi: 10.1093/bioinformatics/btx086.
7
HiFine: integrating Hi-C-based and shotgun-based methods to refine binning of metagenomic contigs.
Bioinformatics. 2022 May 26;38(11):2973-2979. doi: 10.1093/bioinformatics/btac295.
8
METAMVGL: a multi-view graph-based metagenomic contig binning algorithm by integrating assembly and paired-end graphs.
BMC Bioinformatics. 2021 Jul 22;22(Suppl 10):378. doi: 10.1186/s12859-021-04284-4.
9
CoCoNet: an efficient deep learning tool for viral metagenome binning.
Bioinformatics. 2021 Sep 29;37(18):2803-2810. doi: 10.1093/bioinformatics/btab213.
10
MetaBinner: a high-performance and stand-alone ensemble binning method to recover individual genomes from complex microbial communities.
Genome Biol. 2023 Jan 6;24(1):1. doi: 10.1186/s13059-022-02832-6.

Cited by

1
Overcoming challenges in metagenomic AMR surveillance with nanopore sequencing: a case study on fluoroquinolone resistance.
Front Microbiol. 2025 Jul 23;16:1614301. doi: 10.3389/fmicb.2025.1614301. eCollection 2025.
2
Genome-resolved long-read sequencing expands known microbial diversity across terrestrial habitats.
Nat Microbiol. 2025 Jul 24. doi: 10.1038/s41564-025-02062-z.
3
A review of neural networks for metagenomic binning.
Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf065.
4
Myxozoan parasite genomes assembled from contaminated host data reveal extensive gene order conservation and rapid sequence evolution.
G3 (Bethesda). 2025 Jul 9;15(7). doi: 10.1093/g3journal/jkaf061.
5
Deep learning in microbiome analysis: a comprehensive review of neural network models.
Front Microbiol. 2025 Jan 22;15:1516667. doi: 10.3389/fmicb.2024.1516667. eCollection 2024.
6
Binning Metagenomic Contigs Using Contig Embedding and Decomposed Tetranucleotide Frequency.
Biology (Basel). 2024 Sep 24;13(10):755. doi: 10.3390/biology13100755.
7
Decomposing a San Francisco estuary microbiome using long-read metagenomics reveals species- and strain-level dominance from picoeukaryotes to viruses.
mSystems. 2024 Sep 17;9(9):e0024224. doi: 10.1128/msystems.00242-24. Epub 2024 Aug 19.
8
Disentangling cobionts and contamination in long-read genomic data using sequence composition.
G3 (Bethesda). 2024 Nov 6;14(11). doi: 10.1093/g3journal/jkae187.
9
Solving genomic puzzles: computational methods for metagenomic binning.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae372.
10
Chlamydiae as symbionts of photosynthetic dinoflagellates.
ISME J. 2024 Jan 8;18(1). doi: 10.1093/ismejo/wrae139.

References

1
BinSPreader: Refine binning results for fuller MAG reconstruction.
iScience. 2022 Jul 19;25(8):104770. doi: 10.1016/j.isci.2022.104770. eCollection 2022 Aug 19.
2
Oxford Nanopore R10.4 long-read sequencing enables the generation of near-finished bacterial genomes from pure cultures and metagenomes without short-read or reference polishing.
Nat Methods. 2022 Jul;19(7):823-826. doi: 10.1038/s41592-022-01539-7. Epub 2022 Jul 4.
3
Metagenome assembly of high-fidelity long reads with hifiasm-meta.
Nat Methods. 2022 Jun;19(6):671-674. doi: 10.1038/s41592-022-01478-3. Epub 2022 May 9.
4
A review of computational tools for generating metagenome-assembled genomes from metagenomic sequencing data.
Comput Struct Biotechnol J. 2021 Nov 23;19:6301-6314. doi: 10.1016/j.csbj.2021.11.028. eCollection 2021.
5
Evaluating Assembly and Binning Strategies for Time Series Drinking Water Metagenomes.
Microbiol Spectr. 2021 Dec 22;9(3):e0143421. doi: 10.1128/Spectrum.01434-21. Epub 2021 Nov 3.
6
STRONG: metagenomics strain resolution on assembly graphs.
Genome Biol. 2021 Jul 26;22(1):214. doi: 10.1186/s13059-021-02419-7.
7
Connecting structure to function with the recovery of over 1000 high-quality metagenome-assembled genomes from activated sludge using long-read sequencing.
Nat Commun. 2021 Mar 31;12(1):2009. doi: 10.1038/s41467-021-22203-2.
8
Improved metagenome binning and assembly using deep variational autoencoders.
Nat Biotechnol. 2021 May;39(5):555-560. doi: 10.1038/s41587-020-00777-4. Epub 2021 Jan 4.
9
metaFlye: scalable long-read metagenome assembly using repeat graphs.
Nat Methods. 2020 Nov;17(11):1103-1110. doi: 10.1038/s41592-020-00971-x. Epub 2020 Oct 5.
10
Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets.
BMC Bioinformatics. 2020 Jul 28;21(1):334. doi: 10.1186/s12859-020-03667-3.