Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins.

Authors

Muralidharan Harihara Subrahmaniam, Shah Nidhi, Meisel Jacquelyn S, Pop Mihai

Affiliation

Pop Lab, Department of Computer Science, Center for Bioinformatics and Computational Biology, UMIACS, University of Maryland, College Park, MD, United States.

Publication information

Front Microbiol. 2021 Feb 24;12:638561. doi: 10.3389/fmicb.2021.638561. eCollection 2021.

DOI:10.3389/fmicb.2021.638561
PMID:33717033
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7945042/
Abstract

High-throughput sequencing has revolutionized the field of microbiology; however, reconstructing complete genomes of organisms from whole metagenomic shotgun sequencing data remains a challenge. Recovered genomes are often highly fragmented, due to uneven abundances of organisms, repeats within and across genomes, sequencing errors, and strain-level variation. To address the fragmented nature of metagenomic assemblies, scientists rely on a process called binning, which clusters together contigs inferred to originate from the same organism. Existing binning algorithms use oligonucleotide frequencies and contig abundance (coverage) within and across samples to group together contigs from the same organism. However, these algorithms often miss short contigs and contigs from regions with unusual coverage or DNA composition characteristics, such as mobile elements. Here, we propose that information from assembly graphs can assist current strategies for metagenomic binning. We use MetaCarvel, a metagenomic scaffolding tool, to construct assembly graphs where contigs are nodes and edges are inferred based on paired-end reads. We developed a tool, Binnacle, that extracts information from the assembly graphs and clusters scaffolds into comprehensive bins. Binnacle also provides wrapper scripts to integrate with existing binning methods. The Binnacle pipeline can be found on GitHub (https://github.com/marbl/binnacle). We show that binning graph-based scaffolds, rather than contigs, improves the contiguity and quality of the resulting bins, and captures a broader set of the genes of the organisms being reconstructed.
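
The idea in the abstract, that contigs joined by paired-end links can be binned as one unit, can be illustrated with a toy sketch. This is not Binnacle's or MetaCarvel's actual algorithm: the function names, the `min_support` threshold, and the example data are all hypothetical, and real scaffolders also handle orientation, repeats, and graph motifs. The sketch only shows why a short contig with unusual coverage (here `c3`) can still be carried into a group by its graph neighbors, and how a length-weighted coverage for the group might be computed.

```python
from collections import defaultdict


def group_contigs(contigs, links, min_support=2):
    """Group contigs connected by well-supported paired-end links.

    contigs: dict mapping contig name -> (length_bp, mean_coverage)
    links:   list of (contig_a, contig_b, n_supporting_read_pairs)
    min_support is an illustrative threshold, not a value from the paper.
    """
    parent = {name: name for name in contigs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, support in links:
        if support >= min_support:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for name in contigs:
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())


def group_coverage(contigs, group):
    """Length-weighted mean coverage of a group of contigs."""
    total_length = sum(contigs[c][0] for c in group)
    weighted = sum(contigs[c][0] * contigs[c][1] for c in group)
    return weighted / total_length


# Hypothetical data: c3 is short with atypical coverage (e.g. a mobile element).
contigs = {
    "c1": (5000, 20.0),
    "c2": (800, 22.0),
    "c3": (300, 95.0),
    "c4": (4000, 5.0),
}
links = [("c1", "c2", 12), ("c2", "c3", 7), ("c3", "c4", 1)]

groups = group_contigs(contigs, links)
coverages = [group_coverage(contigs, g) for g in groups]
```

In this example `c3` stays with `c1` and `c2` because of its link support, whereas a binner that looks only at per-contig coverage and composition would likely separate it or drop it for being too short.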


