binny: an automated binning algorithm to recover high-quality genomes from complex metagenomic datasets.

Publication information

Brief Bioinform. 2022 Nov 19;23(6). doi: 10.1093/bib/bbac431.

Abstract

The reconstruction of genomes is a critical step in genome-resolved metagenomics and for multi-omic data integration from microbial communities. Here, we present binny, a binning tool that produces high-quality metagenome-assembled genomes (MAGs) from both contiguous and highly fragmented genomes. Based on established metrics, binny outperforms or is highly competitive with commonly used and state-of-the-art binning methods and finds unique genomes that could not be detected by other methods. binny uses k-mer composition and coverage by metagenomic reads for iterative, nonlinear dimension reduction of genomic signatures, followed by automated contig clustering with cluster assessment using lineage-specific marker gene sets. When compared with seven widely used binning algorithms, binny provides substantial amounts of uniquely identified MAGs and almost always recovers the most near-complete ($>95\%$ pure, $>90\%$ complete) and high-quality ($>90\%$ pure, $>70\%$ complete) genomes from simulated datasets from the Critical Assessment of Metagenome Interpretation (CAMI) initiative. On a real-world benchmark composed of metagenomes from various environments, it also recovers substantially more high-quality draft genomes, as defined by the Minimum Information about a Metagenome-Assembled Genome (MIMAG) standard, than any other tested method.
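The two quality tiers quoted in the abstract can be made concrete with a short sketch. The code below is illustrative only and is not taken from binny: the `Bin` class, the `quality_tier` and `count_tiers` functions, and the example values are all hypothetical names and numbers chosen for this example. It assumes purity and completeness are given as percentages and reports only the strictest tier a bin satisfies, so a near-complete bin is not also counted as high-quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    """Hypothetical container for one recovered genome bin."""
    name: str
    purity: float        # percent, 0-100
    completeness: float  # percent, 0-100


def quality_tier(b: Bin) -> str:
    """Return the strictest tier from the abstract that the bin satisfies."""
    # Near-complete: >95% pure and >90% complete.
    if b.purity > 95 and b.completeness > 90:
        return "near-complete"
    # High-quality: >90% pure and >70% complete.
    if b.purity > 90 and b.completeness > 70:
        return "high-quality"
    return "other"


def count_tiers(bins):
    """Count how many bins fall into each tier."""
    counts = {"near-complete": 0, "high-quality": 0, "other": 0}
    for b in bins:
        counts[quality_tier(b)] += 1
    return counts


if __name__ == "__main__":
    # Made-up values, for illustration only.
    example = [
        Bin("bin_1", purity=98.2, completeness=93.5),
        Bin("bin_2", purity=92.0, completeness=75.0),
        Bin("bin_3", purity=85.0, completeness=99.0),
    ]
    for b in example:
        print(b.name, quality_tier(b))
    print(count_tiers(example))
```

The thresholds are strict inequalities, matching the "$>$" used in the abstract, so a bin with exactly 95% purity falls into the high-quality tier rather than the near-complete one.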

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5123/9677464/6d95ef671063/bbac431f1.jpg
