Department of Entomology, The Ohio State University-Ohio Agricultural Research and Development Center, 1680 Madison Ave., Wooster, OH, 44691, USA.
Department of Infectious Diseases, The Sahlgrenska Academy, University of Gothenburg, Guldhedsgatan 10, Göteborg, SE-413 46, Sweden.
Mol Ecol Resour. 2017 Jul;17(4):760-769. doi: 10.1111/1755-0998.12628. Epub 2016 Nov 21.
The taxonomic classification of DNA sequences has become a critical component of numerous ecological research applications; however, few studies have evaluated the strengths and weaknesses of commonly used sequence classification approaches. Further, the methods and software available for sequence classification are diverse, creating an environment in which it may be difficult to determine the best course of action and the trade-offs made when using different classification approaches. Here, we provide an in silico evaluation of three DNA sequence classifiers: the RDP Naïve Bayesian Classifier, RTAX and UTAX. Further, we discuss the results, merits and limitations of both the classifiers and our method of classifier evaluation. Our methods of comparison are simple, yet robust, and will provide researchers a methodological and conceptual foundation for making such evaluations in a variety of research situations. Generally, we found a considerable trade-off between accuracy and sensitivity for the classifiers tested, indicating a need for further improvement of sequence classification tools.