RESCRIPt: Reproducible sequence taxonomy reference database management.

Affiliations

University of Arkansas for Medical Sciences, Department of Biomedical Informatics, Little Rock, Arkansas, United States of America.

Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, Arizona, United States of America.

Publication information

PLoS Comput Biol. 2021 Nov 8;17(11):e1009581. doi: 10.1371/journal.pcbi.1009581. eCollection 2021 Nov.

DOI: 10.1371/journal.pcbi.1009581
PMID: 34748542
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8601625/
Abstract

Nucleotide sequence and taxonomy reference databases are critical resources for widespread applications including marker-gene and metagenome sequencing for microbiome analysis, diet metabarcoding, and environmental DNA (eDNA) surveys. Reproducibly generating, managing, using, and evaluating nucleotide sequence and taxonomy reference databases creates a significant bottleneck for researchers aiming to generate custom sequence databases. Furthermore, database composition drastically influences results, and lack of standardization limits cross-study comparisons. To address these challenges, we developed RESCRIPt, a Python 3 software package and QIIME 2 plugin for reproducible generation and management of reference sequence taxonomy databases, including dedicated functions that streamline creating databases from popular sources, and functions for evaluating, comparing, and interactively exploring qualitative and quantitative characteristics across reference databases. To highlight the breadth and capabilities of RESCRIPt, we provide several examples for working with popular databases for microbiome profiling (SILVA, Greengenes, NCBI-RefSeq, GTDB), eDNA and diet metabarcoding surveys (BOLD, GenBank), as well as for genome comparison. We show that bigger is not always better, and reference databases with standardized taxonomies and those that focus on type strains have quantitative advantages, though may not be appropriate for all use cases. Most databases appear to benefit from some curation (quality filtering), though sequence clustering appears detrimental to database quality. Finally, we demonstrate the breadth and extensibility of RESCRIPt for reproducible workflows with a comparison of global hepatitis genomes. RESCRIPt provides tools to democratize the process of reference database acquisition and management, enabling researchers to reproducibly and transparently create reference materials for diverse research applications. 
RESCRIPt is released under a permissive BSD-3 license at https://github.com/bokulich-lab/RESCRIPt.
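
The abstract notes that most databases appear to benefit from some curation (quality filtering). As a rough illustration of what such a filter might look like, the sketch below drops reference sequences that contain too many ambiguous (degenerate) bases or an overly long homopolymer run. It is plain Python and does not use or reproduce RESCRIPt's API; the function names `longest_homopolymer` and `cull_sequences`, and both threshold defaults, are hypothetical choices made for this example.

```python
import re

# IUPAC degenerate nucleotide codes (everything other than A, C, G, T/U).
DEGENERATE = set("RYSWKMBDHVN")


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of one repeated base."""
    runs = re.finditer(r"(.)\1*", seq)
    return max((len(m.group(0)) for m in runs), default=0)


def cull_sequences(seqs, max_degenerates=4, max_homopolymer=7):
    """Keep sequences that pass both quality thresholds.

    seqs maps sequence IDs to nucleotide strings. The thresholds are
    illustrative assumptions, not values taken from the paper.
    """
    kept = {}
    for seq_id, seq in seqs.items():
        seq = seq.upper()
        n_degenerate = sum(base in DEGENERATE for base in seq)
        if n_degenerate > max_degenerates:
            continue  # too many ambiguous bases
        if longest_homopolymer(seq) > max_homopolymer:
            continue  # suspiciously long single-base run
        kept[seq_id] = seq
    return kept


if __name__ == "__main__":
    reference = {
        "clean": "ACGTACGT",
        "ambiguous": "ACGTNNNNNACGT",
        "homopolymer": "ACGAAAAAAAAAT",
    }
    print(sorted(cull_sequences(reference)))
```

In a reproducible workflow, the thresholds would be recorded alongside the resulting database so that others can regenerate the same filtered reference set.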

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/b163b122d58c/pcbi.1009581.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/9aa25fdac978/pcbi.1009581.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/6bf77f24c52d/pcbi.1009581.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/04e040350f54/pcbi.1009581.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/da0658645260/pcbi.1009581.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/416a6e97a501/pcbi.1009581.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/b47d80c65a19/pcbi.1009581.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/2cce0c12375a/pcbi.1009581.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/9a3854a7c47d/pcbi.1009581.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/6ddc541afa95/pcbi.1009581.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/18a2f2167a4f/pcbi.1009581.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/7bdb11d3c891/pcbi.1009581.g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/0bdfb1f251af/pcbi.1009581.g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c39a/8601625/4524f7f6c849/pcbi.1009581.g014.jpg
