Schloss Patrick D
Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA.
mSystems. 2016 Apr 26;1(2). doi: 10.1128/mSystems.00027-16. eCollection 2016 Mar-Apr.
Assignment of 16S rRNA gene sequences to operational taxonomic units (OTUs) allows microbial ecologists to overcome the inconsistencies and biases within bacterial taxonomy and provides a strategy for clustering similar sequences that do not have representatives in a reference database. I have applied the Matthews correlation coefficient to assess the ability of 15 reference-independent and -dependent clustering algorithms to assign sequences to OTUs. This metric quantifies the ability of an algorithm to reflect the relationships between sequences without the use of a reference and can be applied to any data set or method. The most consistently robust method was the average neighbor algorithm; however, for some data sets, other algorithms matched its performance.
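As a rough illustration of how the Matthews correlation coefficient (MCC) can score an OTU assignment without a reference, the sketch below classifies every pair of sequences against a distance threshold. The abstract does not spell out the pair definitions, so the following are assumptions: a pair is a true positive when the two sequences are within the threshold and share an OTU, a false negative when they are within the threshold but split across OTUs, a false positive when they are farther apart but share an OTU, and a true negative otherwise. The 0.03 threshold, the function names, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_confusion(distances, otus, threshold=0.03):
    """Count (tp, tn, fp, fn) over all unordered sequence pairs.

    distances: dict mapping frozenset({a, b}) -> pairwise distance
    otus:      dict mapping sequence name -> OTU label
    threshold: assumed distance cutoff for "similar" sequences
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otus), 2):
        close = distances[frozenset((a, b))] <= threshold
        same_otu = otus[a] == otus[b]
        if close and same_otu:
            tp += 1      # similar and clustered together
        elif close:
            fn += 1      # similar but split apart
        elif same_otu:
            fp += 1      # dissimilar but clustered together
        else:
            tn += 1      # dissimilar and kept apart
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy data (hypothetical): four sequences, three placed in OTU "A".
distances = {
    frozenset(("s1", "s2")): 0.01,
    frozenset(("s1", "s3")): 0.02,
    frozenset(("s2", "s3")): 0.05,
    frozenset(("s1", "s4")): 0.10,
    frozenset(("s2", "s4")): 0.12,
    frozenset(("s3", "s4")): 0.11,
}
otus = {"s1": "A", "s2": "A", "s3": "A", "s4": "B"}

counts = pairwise_confusion(distances, otus)
score = mcc(*counts)
print(counts, round(score, 4))
```

In this toy case the pair s2/s3 exceeds the threshold yet shares an OTU, so it counts as a false positive and pulls the score below 1. Comparing the score across clusterings of the same distance matrix is one way to rank algorithms on a given data set.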