TaxAss: Leveraging a Custom Freshwater Database Achieves Fine-Scale Taxonomic Resolution.

Affiliations

Environmental Chemistry and Technology Program, University of Wisconsin-Madison, Madison, Wisconsin, USA.

Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, USA.

Publication information

mSphere. 2018 Sep 5;3(5):e00327-18. doi: 10.1128/mSphere.00327-18.

DOI:10.1128/mSphere.00327-18
PMID:30185512
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6126143/
Abstract

Taxonomy assignment of freshwater microbial communities is limited by the minimally curated phylogenies used for large taxonomy databases. Here we introduce TaxAss, a taxonomy assignment workflow that classifies 16S rRNA gene amplicon data using two taxonomy reference databases: a large comprehensive database and a small ecosystem-specific database rigorously curated by scientists within a field. We applied TaxAss to five different freshwater data sets using the comprehensive SILVA database and the freshwater-specific FreshTrain database. TaxAss increased the percentage of the data set classified compared to using only SILVA, especially at fine-resolution family to species taxon levels, while across the freshwater test data sets classifications increased by as much as 11 to 40% of total reads. A similar increase in classifications was not observed in a control mouse gut data set, which was not expected to contain freshwater bacteria. TaxAss also maintained taxonomic richness compared to using only the FreshTrain across all taxon levels from phylum to species. Without TaxAss, most organisms not represented in the FreshTrain were unclassified, but at fine taxon levels, incorrect classifications became significant. We validated TaxAss using simulated amplicon data derived from full-length clone libraries and found that 96 to 99% of test sequences were correctly classified at fine resolution. TaxAss splits a data set's sequences into two groups based on their percent identity to reference sequences in the ecosystem-specific database. Sequences with high similarity to sequences in the ecosystem-specific database are classified using that database, and the others are classified using the comprehensive database. 
TaxAss is free and open source and is available at https://www.github.com/McMahonLab/TaxAss.

Microbial communities drive ecosystem processes, but microbial community composition analyses using 16S rRNA gene amplicon data sets are limited by the lack of fine-resolution taxonomy classifications. Coarse taxonomic groupings at the phylum, class, and order levels lump ecologically distinct organisms together. To avoid this, many researchers define operational taxonomic units (OTUs) based on clustered sequences, sequence variants, or unique sequences. These fine-resolution groupings are more ecologically relevant, but OTU definitions are data set dependent and cannot be compared between data sets. Microbial ecologists studying freshwater have curated a small, ecosystem-specific taxonomy database to provide consistent and up-to-date terminology. We created TaxAss, a workflow that leverages this database to assign taxonomy. We found that TaxAss improves fine-resolution taxonomic classifications (family, genus, and species). Fine taxonomic groupings are more ecologically relevant, so they provide an alternative to OTU-based analyses that is consistent and comparable between data sets.
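
The two-database split described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TaxAss implementation: the function names, the 98% cutoff, and the two classifier callables are hypothetical, and the best-hit percent identities are assumed to have been computed beforehand by some alignment step that the sketch does not perform.

```python
from typing import Callable, Dict, List, Mapping, Optional, Tuple


def split_by_identity(
    best_identity: Mapping[str, Optional[float]],
    cutoff: float = 98.0,  # hypothetical cutoff, chosen only for illustration
) -> Tuple[List[str], List[str]]:
    """Partition sequence IDs by their best percent identity to the
    ecosystem-specific database. None means no hit was found."""
    ecosystem_ids: List[str] = []
    comprehensive_ids: List[str] = []
    for seq_id, pident in best_identity.items():
        if pident is not None and pident >= cutoff:
            ecosystem_ids.append(seq_id)
        else:
            comprehensive_ids.append(seq_id)
    return ecosystem_ids, comprehensive_ids


def assign_taxonomy(
    best_identity: Mapping[str, Optional[float]],
    classify_ecosystem: Callable[[str], str],      # assumed stand-in classifier
    classify_comprehensive: Callable[[str], str],  # assumed stand-in classifier
    cutoff: float = 98.0,
) -> Dict[str, str]:
    """Classify high-identity sequences with the ecosystem-specific database
    and every other sequence with the comprehensive database."""
    ecosystem_ids, comprehensive_ids = split_by_identity(best_identity, cutoff)
    taxonomy = {seq_id: classify_ecosystem(seq_id) for seq_id in ecosystem_ids}
    taxonomy.update(
        {seq_id: classify_comprehensive(seq_id) for seq_id in comprehensive_ids}
    )
    return taxonomy


if __name__ == "__main__":
    # Toy input: sequence ID -> best percent identity (made-up values).
    identities = {"seq1": 99.5, "seq2": 91.0, "seq3": None}
    result = assign_taxonomy(
        identities,
        classify_ecosystem=lambda s: "ecosystem-db:" + s,
        classify_comprehensive=lambda s: "comprehensive-db:" + s,
    )
    for seq_id, label in result.items():
        print(seq_id, label)
```

In this sketch a sequence with no hit at all falls through to the comprehensive database, which matches the abstract's description that "the others" are classified there. The choice of cutoff is left as a parameter because the abstract does not state one.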
