College of Life Sciences, Capital Normal University, Beijing 100048, China.
Mol Ecol. 2012 Apr;21(8):1848-63. doi: 10.1111/j.1365-294X.2011.05235.x. Epub 2011 Aug 29.
Reliable assignment of an unknown query sequence to its correct species remains a methodological problem for the growing field of DNA barcoding. While great advances have been achieved recently, species identification from barcodes can still be unreliable if the relevant biodiversity has been insufficiently sampled. Here we propose a new notion of species membership for DNA barcoding (fuzzy membership, based on fuzzy set theory) and illustrate its successful application to four real data sets (bats, fishes, butterflies and flies) with more than 5000 random simulations. Two of the data sets comprise especially dense species/population-level samples. In comparison with current DNA barcoding methods, the newly proposed minimum distance (MD) plus fuzzy set approach and another computationally simple method, 'best close match', outperform two computationally sophisticated methods, Bayesian and BootstrapNJ. When conspecifics of the query are absent from the reference database, the new method is considerably better than the other methods at reducing false-positive species identifications.
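The general idea of combining a minimum-distance assignment with a graded membership value can be sketched as follows. This is only an illustration under stated assumptions, not the procedure from the paper: the abstract does not give the membership function, so the linear ramp in `fuzzy_membership`, its `full` and `zero` cut-offs, the uncorrected p-distance, and all function names are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest_species(query: str, references: Dict[str, List[str]]) -> Tuple[str, float]:
    """Minimum-distance step: return the species of the closest reference
    barcode together with that distance."""
    candidates = [
        (species, p_distance(query, seq))
        for species, seqs in references.items()
        for seq in seqs
    ]
    if not candidates:
        raise ValueError("reference database is empty")
    return min(candidates, key=lambda item: item[1])


def fuzzy_membership(distance: float, full: float, zero: float) -> float:
    """Hypothetical membership function (an assumption, not the paper's):
    1.0 at or below `full`, 0.0 at or above `zero`, linear in between."""
    if zero <= full:
        raise ValueError("`zero` must be greater than `full`")
    if distance <= full:
        return 1.0
    if distance >= zero:
        return 0.0
    return (zero - distance) / (zero - full)


def identify(query: str, references: Dict[str, List[str]],
             full: float, zero: float) -> Tuple[str, float, float]:
    """Return (closest species, distance, membership value in [0, 1])."""
    species, distance = nearest_species(query, references)
    return species, distance, fuzzy_membership(distance, full, zero)
```

Under these assumptions, a query whose closest reference is far away still gets a nearest species, but with a membership value near zero. That is how a graded membership can flag a likely false positive when the query's own species is missing from the reference database, rather than forcing a yes/no assignment.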