Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, USA.
Bioinformatics. 2011 Jan 1;27(1):127-9. doi: 10.1093/bioinformatics/btq619. Epub 2010 Nov 8.
High-throughput sequencing technologies have yielded vast amounts of data about the organisms in environmental samples. Yet assessing the exact organism content of these samples remains a challenge, because taxonomic classification is computationally too demanding to annotate every read in a dataset. An easy-to-use webserver is needed to process these reads. Many methods exist, but only a few are publicly available on webservers, and most of those do not annotate all reads.
We introduce a webserver that implements the naïve Bayes classifier (NBC) to classify all metagenomic reads to their best taxonomic match. Results indicate that NBC can assign next-generation sequencing reads to their taxonomic classification and can find significant populations of genera that other classifiers may miss.
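To illustrate the general idea of assigning every read to its best-scoring taxon with a naïve Bayes classifier, here is a minimal sketch. It is not the webserver's implementation: the k-mer features, the add-one smoothing, the uniform prior over taxa, and the names `kmers`, `train`, `log_score` and `classify` are all assumptions made for the example, and the toy reference sequences are invented.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence (upper-cased)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(references, k):
    """Build one k-mer count model per taxon.

    references: dict mapping a taxon label to a reference sequence.
    Returns a dict mapping each label to (counts, total, vocabulary size).
    """
    vocab = 4 ** k  # number of possible DNA k-mers (assumed alphabet: A, C, G, T)
    models = {}
    for label, seq in references.items():
        counts = Counter(kmers(seq, k))
        models[label] = (counts, sum(counts.values()), vocab)
    return models


def log_score(read, model, k):
    """Log-likelihood of a read under one taxon, treating k-mers as independent."""
    counts, total, vocab = model
    # Add-one smoothing (an assumption) keeps unseen k-mers from scoring -infinity.
    return sum(math.log((counts[km] + 1) / (total + vocab)) for km in kmers(read, k))


def classify(read, models, k):
    """Return the label with the highest score; with a uniform prior this is the MAP taxon."""
    return max(models, key=lambda label: log_score(read, models[label], k))


# Toy reference sequences, invented purely for illustration.
references = {
    "genus_A": "ACGTACGTACGTACGTACGT",
    "genus_B": "TTTTGGGGTTTTGGGGTTTT",
}
models = train(references, k=3)
label_a = classify("ACGTACG", models, k=3)
label_b = classify("TTGGGGTT", models, k=3)
```

Because the classifier takes the maximum over all taxa rather than applying a score threshold, every read receives a label, which is the sense in which such a scheme annotates all reads.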
Publicly available at: http://nbc.ece.drexel.edu.