Mutanen Marko, Kekkonen Mari, Prosser Sean W J, Hebert Paul D N, Kaila Lauri
Biodiversity Unit, Department of Biology, University of Oulu, P.O. Box 3000, FI-90014, Oulu, Finland.
Zoology Unit, Finnish Museum of Natural History, University of Helsinki, P.O. Box 17, FI-00014, Helsinki, Finland.
Mol Ecol Resour. 2015 Jul;15(4):967-84. doi: 10.1111/1755-0998.12361. Epub 2015 Jan 5.
Each holotype specimen provides the only objective link to a particular Linnean binomen. Sequence information from them is increasingly valuable due to the growing usage of DNA barcodes in taxonomy. As type specimens are often old, it may only be possible to recover fragmentary sequence information from them. We tested the efficacy of short sequences from type specimens in the resolution of a challenging taxonomic puzzle: the Elachista dispunctella complex, which includes 64 described species with minuscule morphological differences. We applied a multistep procedure to resolve the taxonomy of this species complex. First, we sequenced a large number of newly collected specimens and as many holotypes as possible. Second, we used all sequences >400 bp to examine species boundaries. We employed three unsupervised methods (BIN, ABGD, GMYC) with specified criteria on how to handle discordant results and examined diagnostic bases from each delineated putative species (operational taxonomic units, OTUs). Third, we evaluated the morphological characters of each OTU. Finally, we associated short barcodes from types with the delineated OTUs. In this step, we employed various supervised methods, including distance-based, tree-based and character-based approaches. We recovered 658 bp barcode sequences from 194 of 215 fresh specimens and recovered an average of 141 bp from 33 of 42 holotypes. We observed strong congruence among all methods and good correspondence with morphology. We demonstrate potential pitfalls with tree-, distance- and character-based approaches when associating sequences of varied length. Our results suggest that sequences as short as 56 bp can often provide valuable taxonomic information. The results support significant taxonomic oversplitting of species in the Elachista dispunctella complex.
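The distance-based association of a short type barcode with a delineated OTU can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `p_distance` and `assign_to_otu`, the pre-aligned input, the use of uncorrected p-distance, and the `min_overlap` default (borrowed from the 56 bp figure in the abstract) are all assumptions made for illustration.

```python
def p_distance(a, b):
    """Uncorrected p-distance between two aligned sequences.

    Only positions where both sequences carry an unambiguous base
    (A, C, G or T) are compared, so gaps and Ns in a fragmentary
    type sequence are skipped. Returns (distance, sites_compared);
    distance is None when no site can be compared.
    """
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            diffs += x != y
    if compared == 0:
        return None, 0
    return diffs / compared, compared


def assign_to_otu(query, references, min_overlap=56):
    """Assign a short query to the OTU holding its nearest reference.

    references: dict mapping an OTU label to a list of sequences
    aligned to the same coordinates as the query (hypothetical input
    format). Comparisons with fewer than min_overlap shared sites are
    ignored. Returns (otu, distance, overlap), or None if no reference
    overlaps the query sufficiently.
    """
    best = None
    for otu, seqs in references.items():
        for ref in seqs:
            dist, overlap = p_distance(query, ref)
            if dist is None or overlap < min_overlap:
                continue
            if best is None or dist < best[1]:
                best = (otu, dist, overlap)
    return best
```

The overlap is returned alongside the distance so that an assignment resting on very few shared sites can be flagged rather than accepted silently. A short fragment may also be equally close to references in more than one OTU; this sketch keeps only the first nearest match and would need extending to report such ties.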