Musée National d'Histoire Naturelle, 25 Rue Münster, 2160 Luxembourg, Luxembourg.
University of Duisburg-Essen, Faculty of Biology, Aquatic Ecosystem Research, Universitaetsstr. 5, 45141 Essen, Germany.
Sci Total Environ. 2019 Aug 15;678:499-524. doi: 10.1016/j.scitotenv.2019.04.247. Epub 2019 Apr 27.
Effective identification of species using short DNA fragments (DNA barcoding and DNA metabarcoding) requires reliable sequence reference libraries of known taxa. Both taxonomically comprehensive coverage and content quality are important for sufficient accuracy. For aquatic ecosystems in Europe, reliable barcode reference libraries are particularly important if molecular identification tools are to be implemented in biomonitoring and reporting in the context of the EU Water Framework Directive (WFD) and the Marine Strategy Framework Directive (MSFD). We analysed gaps in the two most important reference databases, Barcode of Life Data Systems (BOLD) and NCBI GenBank, with a focus on the taxa most frequently used in WFD and MSFD. Our analyses show that coverage varies strongly among taxonomic groups and among geographic regions. In general, groups that were actively targeted in barcode projects (e.g. fish, true bugs, caddisflies and vascular plants) are well represented in the barcode libraries, while others have fewer records (e.g. marine molluscs, ascidians, and freshwater diatoms). We also found that species monitored in several countries are often represented by barcodes in reference libraries, while species monitored in a single country frequently lack sequence records. A large proportion of species (up to 50%) in several taxonomic groups are represented only by private data in BOLD. Our results have implications for the future strategy to fill existing gaps in barcode libraries, especially if DNA metabarcoding is to be used in the monitoring of European aquatic biota under the WFD and MSFD. For example, missing species relevant to monitoring in multiple countries should be prioritized for future collaborative programs. We also discuss why a strategy for quality control and quality assurance of barcode reference libraries is needed and recommend future steps to ensure full utilisation of metabarcoding in aquatic biomonitoring.
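The kind of gap analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes a monitoring checklist has already been matched by name against reference records, and it classifies each species as having public barcodes, only private barcodes, or none. The function name `coverage_by_group`, the input layout, and the placeholder species labels are all hypothetical.

```python
from collections import defaultdict


def coverage_by_group(checklist, records):
    """Summarise barcode coverage per taxonomic group.

    checklist: dict mapping species name -> taxonomic group (hypothetical layout)
    records:   iterable of (species name, is_public) tuples, one per barcode record
    """
    status = {}
    for species, is_public in records:
        if species not in checklist:
            continue  # record for a taxon that is not on the monitoring list
        if is_public:
            status[species] = "public"  # a public record always wins
        else:
            status.setdefault(species, "private_only")

    summary = defaultdict(lambda: {"public": 0, "private_only": 0, "missing": 0})
    for species, group in checklist.items():
        summary[group][status.get(species, "missing")] += 1
    return {group: dict(counts) for group, counts in summary.items()}


# Placeholder data for illustration only; not taken from the study.
checklist = {"sp_a": "fish", "sp_b": "fish", "sp_c": "diatoms", "sp_d": "diatoms"}
records = [("sp_a", True), ("sp_b", False), ("sp_c", False), ("sp_c", True), ("sp_x", True)]
result = coverage_by_group(checklist, records)
```

Species counted as `missing` in groups that are monitored in several countries would be the natural first candidates for new barcoding effort, in line with the prioritisation suggested in the abstract.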