Clustering 16S rRNA for OTU prediction: a method of unsupervised Bayesian clustering.

Affiliation

Department of Biology, University of Southern California, University Park, Los Angeles, CA 90089, USA.

Publication information

Bioinformatics. 2011 Mar 1;27(5):611-8. doi: 10.1093/bioinformatics/btq725. Epub 2011 Jan 13.

DOI:10.1093/bioinformatics/btq725

PMID:21233169

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3042185/

Abstract

MOTIVATION

With the advancement of next-generation sequencing technology, it is now possible to study samples obtained directly from the environment. In particular, 16S rRNA gene sequences have been frequently used to profile the diversity of organisms in a sample. However, such studies still struggle to determine both the number of operational taxonomic units (OTUs) and their relative abundance in a sample.

RESULTS

To address these challenges, we propose an unsupervised Bayesian clustering method termed Clustering 16S rRNA for OTU Prediction (CROP). CROP can find clusters based on the natural organization of data without setting a hard cut-off threshold (3%/5%) as required by hierarchical clustering methods. By applying our method to several datasets, we demonstrate that CROP is robust against sequencing errors and that it produces more accurate results than conventional hierarchical clustering methods.
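To make the contrast concrete, the sketch below illustrates the conventional threshold-based approach that the abstract argues against. It is not CROP and not the authors' code. It assumes pre-aligned, equal-length sequences, uses a plain mismatch fraction as the distance, and applies single-linkage joining; the names `mismatch_fraction`, `threshold_otus`, and `mutate`, and the toy sequences, are hypothetical.

```python
from itertools import combinations


def mismatch_fraction(a, b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def threshold_otus(seqs, cutoff):
    """Single-linkage clustering with a hard distance cut-off (illustrative baseline)."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of sequences whose distance is within the cut-off.
    for i, j in combinations(range(len(seqs)), 2):
        if mismatch_fraction(seqs[i], seqs[j]) <= cutoff:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def mutate(seq, positions):
    """Substitute a different base at each listed position (toy data helper)."""
    chars = list(seq)
    for p in positions:
        chars[p] = "A" if chars[p] == "T" else "T"
    return "".join(chars)


# Toy data: a 100-base reference plus variants differing by 2% and 4%.
base = "ACGT" * 25
seqs = [base, mutate(base, [0, 1]), mutate(base, [50, 51, 52, 53])]

print(threshold_otus(seqs, 0.03))  # cluster membership at a 3% cut-off
print(threshold_otus(seqs, 0.05))  # cluster membership at a 5% cut-off
```

On this toy input the 3% cut-off yields two clusters and the 5% cut-off yields one, so the OTU count depends directly on the chosen threshold. This sensitivity is what a threshold-free method aims to avoid.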

AVAILABILITY AND IMPLEMENTATION

Source code freely available at the following URL: http://code.google.com/p/crop-tingchenlab/, implemented in C++ and supported on Linux and MS Windows.
