Characterization of microbial dark matter at scale with MetaSBT and taxonomy-aware Sequence Bloom Trees.

Author information

Cumbo Fabio, Blankenberg Daniel

Publication information

bioRxiv. 2025 Aug 30:2025.08.25.672238. doi: 10.1101/2025.08.25.672238.

DOI:10.1101/2025.08.25.672238
PMID:40909705
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12407952/
Abstract

Metagenomics has become a powerful tool for studying microbial communities, allowing researchers to investigate microbial diversity within complex environmental samples. Recent advances in sequencing technology have enabled the recovery of near-complete microbial genomes, known as metagenome-assembled genomes (MAGs), directly from metagenomic samples. However, accurately characterizing these genomes remains a significant challenge because of sequencing errors, incomplete assembly, and contamination. Here we present MetaSBT, a new tool for organizing, indexing, and characterizing microbial reference genomes and MAGs. It identifies clusters of genomes at all seven taxonomic levels, from kingdom down to species, using the Sequence Bloom Tree (SBT) data structure, which relies on Bloom Filters (BFs) to index massive numbers of genomes based on their k-mer composition. We have built an initial set of databases composed of over 190 thousand viral genomes from NCBI GenBank and public sources, grouped into sequence-consistent clusters at different taxonomic levels, making it the first software solution for the classification of viruses at different ranks, including still unknown ones. This yields over 40 thousand species clusters, of which ∼80% do not match any known viral species in reference databases to date. Furthermore, we show how our databases can be used as a new basis for existing quantitative metagenomic profilers to unlock the detection of unknown microbes and the estimation of their abundance in metagenomic samples. Finally, the framework is released as open source and, along with its public databases, is fully integrated into the Galaxy Platform, enabling broad accessibility.

IMPORTANCE

The MetaSBT framework and its databases, together with its integration in the Galaxy Platform, provide a powerful resource for microbial research. MetaSBT offers a scalable approach for classifying microbial genomes, including previously unknown ones. This facilitates the discovery and characterization of novel taxa, which is crucial for expanding our knowledge of microbial diversity and its implications for host health and environmental factors. Furthermore, MetaSBT databases can serve as a reference base for other state-of-the-art tools, enhancing their ability to identify, analyze, and classify unknown microbes in metagenomic samples.
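
The abstract describes indexing genomes by their k-mer composition with Bloom filters arranged in a Sequence Bloom Tree. The following is a minimal Python sketch of that general idea, not MetaSBT's implementation: every name (`kmers`, `bloom`, `build_tree`, `query`), the toy parameters `K`, `M`, and `NUM_HASHES`, and the simple pairwise tree construction are assumptions made for illustration. In particular, the tree here pairs genomes arbitrarily rather than following taxonomic levels.

```python
import hashlib

K = 4            # k-mer length (toy value, assumed; real tools use larger k)
M = 1024         # Bloom filter size in bits (assumed)
NUM_HASHES = 3   # hash functions per k-mer (assumed)


def kmers(seq, k=K):
    """Return the set of k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def positions(kmer):
    """Map a k-mer to NUM_HASHES bit positions using salted SHA-256."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{kmer}".encode()).digest()[:8], "big") % M
        for i in range(NUM_HASHES)
    ]


def bloom(seq):
    """Build a Bloom filter, stored as an int bitmask, for one sequence."""
    bits = 0
    for km in kmers(seq):
        for p in positions(km):
            bits |= 1 << p
    return bits


def contains(bits, kmer):
    """True if the k-mer may be present (false positives are possible)."""
    return all((bits >> p) & 1 for p in positions(kmer))


class Node:
    def __init__(self, bits, name=None, children=()):
        self.bits = bits
        self.name = name
        self.children = list(children)


def build_tree(genomes):
    """Pair nodes bottom-up; a parent's filter is the bitwise OR of its children."""
    level = [Node(bloom(seq), name) for name, seq in genomes.items()]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            group = level[i:i + 2]
            bits = 0
            for node in group:
                bits |= node.bits
            parents.append(Node(bits, children=group))
        level = parents
    return level[0]


def query(node, query_kmers, threshold=0.8):
    """Return leaf names whose filters hold at least `threshold` of the query
    k-mers, skipping any subtree whose union filter already falls short."""
    hits = sum(contains(node.bits, km) for km in query_kmers)
    if hits < threshold * len(query_kmers):
        return []
    if not node.children:
        return [node.name]
    return [name for child in node.children
            for name in query(child, query_kmers, threshold)]


tree = build_tree({
    "a": "ACGTACGTTGCAAGCT",
    "b": "TTTTGGGGCCCCAAAA",
    "c": "GATTACAGATTACACC",
})
matches = query(tree, kmers("ACGTACGT"))  # always includes "a"
```

Because a Bloom filter never produces false negatives, a genome that truly contains the query k-mers is always reported. Unrelated genomes can occasionally appear as false positives, which is why practical systems tune the filter size and number of hash functions to the data.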

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b63f/12407952/afadb97d463c/nihpp-2025.08.25.672238v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b63f/12407952/f2a7e1e184b8/nihpp-2025.08.25.672238v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b63f/12407952/5d47a343f0c1/nihpp-2025.08.25.672238v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b63f/12407952/09731976db79/nihpp-2025.08.25.672238v1-f0003.jpg
