Clustering 16S rRNA for OTU prediction: a method of unsupervised Bayesian clustering.

Affiliation

Department of Biology, University of Southern California, University Park, Los Angeles, CA 90089, USA.

Publication information

Bioinformatics. 2011 Mar 1;27(5):611-8. doi: 10.1093/bioinformatics/btq725. Epub 2011 Jan 13.

Abstract

MOTIVATION

With advances in next-generation sequencing technology, it is now possible to study samples obtained directly from the environment. In particular, 16S rRNA gene sequences are frequently used to profile the diversity of organisms in a sample. However, such studies still struggle to determine both the number of operational taxonomic units (OTUs) in a sample and their relative abundance.

RESULTS

To address these challenges, we propose an unsupervised Bayesian clustering method termed Clustering 16S rRNA for OTU Prediction (CROP). CROP finds clusters based on the natural organization of the data, without the hard cut-off threshold (for example, 3% or 5% sequence dissimilarity) required by hierarchical clustering methods. By applying our method to several datasets, we demonstrate that CROP is robust against sequencing errors and that it produces more accurate results than conventional hierarchical clustering methods.
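To make the contrast concrete, the following is a minimal sketch of the fixed-threshold approach that the abstract argues against. It is not CROP and not taken from the CROP source code. It assumes pre-aligned reads of equal length, uses the fraction of mismatched positions as the dissimilarity, and uses single-linkage clustering for brevity; the function names and the toy reads are hypothetical.

```python
from itertools import combinations


def dissimilarity(a: str, b: str) -> float:
    """Fraction of mismatched positions between two aligned, equal-length reads."""
    if len(a) != len(b):
        raise ValueError("reads must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def threshold_otus(reads: list[str], cutoff: float) -> list[set[int]]:
    """Single-linkage clustering: link reads whose dissimilarity is <= cutoff."""
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(reads)), 2):
        if dissimilarity(reads[i], reads[j]) <= cutoff:
            parent[find(i)] = find(j)

    clusters: dict[int, set[int]] = {}
    for i in range(len(reads)):
        clusters.setdefault(find(i), set()).add(i)
    # Return clusters as sets of read indices, ordered by smallest index.
    return sorted(clusters.values(), key=min)


def mutate(seq: str, positions) -> str:
    """Substitute the base at each given position (hypothetical toy helper)."""
    chars = list(seq)
    for p in positions:
        chars[p] = "A" if chars[p] != "A" else "C"
    return "".join(chars)


# Hypothetical toy data: a 100-base reference and three variants of it.
base = "ACGT" * 25
reads = [
    base,
    mutate(base, [10, 50]),          # 2 mismatches against base
    mutate(base, [10, 50, 70, 90]),  # 4 mismatches against base
    mutate(base, range(0, 100, 5)),  # 20 mismatches against base
]

for cutoff in (0.01, 0.03):
    print(cutoff, threshold_otus(reads, cutoff))
```

On these toy reads the number of predicted OTUs drops from four at a 1% cut-off to two at a 3% cut-off, so the diversity estimate depends directly on a threshold chosen in advance. This is the sensitivity that a method clustering on the natural organization of the data is meant to avoid.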

AVAILABILITY AND IMPLEMENTATION

The source code is freely available at http://code.google.com/p/crop-tingchenlab/. CROP is implemented in C++ and supported on Linux and MS Windows.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bfe2/3042185/82405f20cf9e/btq725f1.jpg
