BinSanity: unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation.

Authors

Graham Elaina D, Heidelberg John F, Tully Benjamin J

Affiliations

Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA.

Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA; Center for Dark Energy Biosphere Investigations, Los Angeles, CA, USA.

Publication information

PeerJ. 2017 Mar 8;5:e3035. doi: 10.7717/peerj.3035. eCollection 2017.

DOI:10.7717/peerj.3035
PMID:28289564
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5345454/
Abstract

Metagenomics has become an integral part of defining microbial diversity in various environments. Many ecosystems have characteristically low biomass and few cultured representatives. Linking potential metabolisms to phylogeny in environmental microorganisms is important for interpreting microbial community functions and the impacts these communities have on geochemical cycles. However, with metagenomic studies there is the computational hurdle of 'binning' contigs into phylogenetically related units or putative genomes. Binning methods have been implemented with varying approaches such as k-means clustering, Gaussian mixture models, hierarchical clustering, neural networks, and two-way clustering; however, many of these suffer from biases against low coverage/abundance organisms and closely related taxa/strains. We are introducing a new binning method, BinSanity, that utilizes the clustering algorithm affinity propagation (AP), to cluster assemblies using coverage with compositional based refinement (tetranucleotide frequency and percent GC content) to optimize bins containing multiple source organisms. This separation of composition and coverage based clustering reduces bias for closely related taxa. BinSanity was developed and tested on artificial metagenomes varying in size and complexity. Results indicate that BinSanity has a higher precision, recall, and Adjusted Rand Index compared to five commonly implemented methods. When tested on a previously published environmental metagenome, BinSanity generated high completion and low redundancy bins corresponding with the published metagenome-assembled genomes.
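
As a minimal sketch of the two compositional features named in the abstract (tetranucleotide frequency and percent GC content), the Python below computes both for a single contig. The function names, the toy sequence, and the choice to skip windows containing ambiguous bases are illustrative assumptions. This is not BinSanity's own code, and unlike some binning tools it does not merge reverse-complement tetramers.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides in a fixed order, so that every
# contig yields a feature vector with the same layout.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Return percent GC over the unambiguous (A/C/G/T) bases of seq."""
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (counts["G"] + counts["C"]) / acgt


def tetranucleotide_frequencies(seq):
    """Return relative frequencies of the 256 tetramers in seq.

    Windows that contain an ambiguous base (for example N) are skipped.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


# Toy sequence for illustration only (hypothetical, not real data).
contig = "ATGCGCGCATTTAGCNATGC"
features = [gc_content(contig)] + tetranucleotide_frequencies(contig)
print(len(features))  # one GC value plus 256 tetramer frequencies
```

Per-contig vectors of this kind are the sort of input a composition-based refinement step could cluster, after an initial grouping by coverage.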

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c99a/5345454/00703ddd34ac/peerj-05-3035-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c99a/5345454/e023866c524b/peerj-05-3035-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c99a/5345454/aa3ece46470b/peerj-05-3035-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c99a/5345454/b010b2d4617a/peerj-05-3035-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c99a/5345454/8966b8f1f640/peerj-05-3035-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c99a/5345454/3a5208fc23a9/peerj-05-3035-g006.jpg
