A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

Authors

Gao Xiang, Lin Huaiying, Revanna Kashi, Dong Qunfeng

Affiliations

Department of Public Health Sciences, Loyola University Chicago Health Sciences Division, Maywood, IL, 60153, USA.

Center for Biomedical Informatics, Loyola University Chicago Health Sciences Division, Maywood, IL, 60153, USA.

Publication details

BMC Bioinformatics. 2017 May 10;18(1):247. doi: 10.1186/s12859-017-1670-4.

DOI:10.1186/s12859-017-1670-4
PMID:28486927
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5424349/
Abstract

BACKGROUND

Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement.

RESULTS

We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and the reliability of each classification is further evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific to different taxonomic groups. Instead, only a reference database is required for aligning to the query sequences, making our method easily applicable to different regions of the 16S rRNA gene or other phylogenetic marker genes.

CONCLUSIONS

Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA.
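
The idea summarized in the abstract (similarity-weighted votes from several database hits, with bootstrap confidence at each rank) can be illustrated with a small Python sketch. This is not the BLCA implementation; for that, see the repository linked above. Everything here is an assumption made for illustration: the function names `identity`, `posterior_weights`, and `classify`, the `sharpness` parameter, the uniform prior, the exponential likelihood, and the requirement that hits are already aligned to the query at equal length. Weighted voting per rank stands in for the lowest-common-ancestor step.

```python
import math
import random
from collections import defaultdict

RANKS = ("phylum", "class", "order", "family", "genus", "species")


def identity(a, b, columns):
    """Fraction of the sampled alignment columns where a and b agree.

    Columns that are gaps in both sequences are skipped.
    """
    matches = compared = 0
    for i in columns:
        if a[i] == "-" and b[i] == "-":
            continue
        compared += 1
        matches += a[i] == b[i]
    return matches / compared if compared else 0.0


def posterior_weights(similarities, sharpness=50.0):
    """Turn similarities into normalized weights.

    Assumption: uniform prior over hits and a likelihood proportional to
    exp(sharpness * similarity). Subtracting the maximum avoids overflow.
    """
    top = max(similarities)
    likelihoods = [math.exp(sharpness * (s - top)) for s in similarities]
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def classify(query, hits, ranks=RANKS, n_boot=100, seed=0):
    """Return {rank: (taxon, confidence in percent)} for one query.

    hits is a list of (aligned_sequence, lineage) pairs, where lineage
    maps each rank name to a taxon name.
    """
    if not hits:
        raise ValueError("at least one database hit is required")
    rng = random.Random(seed)
    length = len(query)
    support = {rank: defaultdict(float) for rank in ranks}
    for _ in range(n_boot):
        # Bootstrap: resample alignment columns with replacement.
        columns = [rng.randrange(length) for _ in range(length)]
        sims = [identity(query, seq, columns) for seq, _ in hits]
        weights = posterior_weights(sims)
        # Each hit votes for its own lineage, weighted by its posterior.
        for weight, (_, lineage) in zip(weights, hits):
            for rank in ranks:
                support[rank][lineage[rank]] += weight
    result = {}
    for rank in ranks:
        taxon, score = max(support[rank].items(), key=lambda kv: kv[1])
        result[rank] = (taxon, 100.0 * score / n_boot)
    return result


if __name__ == "__main__":
    # Hypothetical toy data: three pre-aligned hits for one query.
    toy_hits = [
        ("ACGTACGTAC", {"genus": "G", "species": "A"}),
        ("ACGTACGTTT", {"genus": "G", "species": "B"}),
        ("TTTTTTGTAC", {"genus": "H", "species": "C"}),
    ]
    print(classify("ACGTACGTAC", toy_hits, ranks=("genus", "species")))
```

In this sketch, hits from the same genus pool their weight at the genus rank, so genus-level confidence is at least as high as species-level confidence. That mirrors the abstract's point that assignments become more reliable toward higher ranks.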

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e27/5424349/8855da7f5c80/12859_2017_1670_Fig1_HTML.jpg
