Pérez-Burillo Javier, Mann David G, Trobajo Rosa
IRTA-Institute for Food and Agricultural Research and Technology, Marine and Continental Waters Programme, Ctra de Poble Nou Km 5.5, E43540, La Ràpita, Tarragona, Spain; Departament de Geografia, Universitat Rovira i Virgili, C/ Joanot Martorell 15, E43500, Vila-seca, Tarragona, Spain.
IRTA-Institute for Food and Agricultural Research and Technology, Marine and Continental Waters Programme, Ctra de Poble Nou Km 5.5, E43540, La Ràpita, Tarragona, Spain; Royal Botanic Garden Edinburgh, Edinburgh, EH3 5LR, Scotland, UK.
Chemosphere. 2022 Nov;307(Pt 3):135933. doi: 10.1016/j.chemosphere.2022.135933. Epub 2022 Aug 8.
Two short diatom rbcL barcodes, 331 bp and 263 bp in length, have frequently been used in diatom metabarcoding studies. They share a common 263-bp region and differ only in the presence or absence of a 68-bp tail at the 5' end. Although the effectiveness of each has been demonstrated in separate biomonitoring and diversity studies, the impact of the 68-bp non-shared region has not been evaluated. Here we compare the two barcodes in terms of the values of a biotic index (IPS) and the ecological status classes obtained when they are applied to an extensive metabarcoding dataset from United Kingdom rivers; the dataset comprised 1703 samples and was produced using the 331-bp primers. In addition, we assess how effectively each barcode discriminates genetic variants around and below the species level. The strong correlation in IPS values between barcodes (Pearson's R = 0.98) indicates that the choice of barcode does not have major implications for current Water Framework Directive (WFD) ecological assessments, although a small number of sites (55, or 3.23% of those analysed) were downgraded from an acceptable WFD class ("Good") to an unacceptable one ("Moderate"). Analyses of the taxonomic resolution of the two barcodes indicate that, for many amplicon sequence variants (ASVs), either marker (263-bp or 331-bp) gives unambiguous assignments at species level, though with differences in bootstrap confidence values. These differences arise from the stochasticity of the naïve Bayesian classifier used and from the greater genetic distance between closely related species when the 331-bp barcode is used. However, in three cases, species differentiation fails with the shorter marker, leading to underestimates of species diversity. Finally, two ASVs from Nitzschia species showed that the shorter marker can sometimes produce false positives when the extent and nature of infraspecific variation are poorly known.
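The relationship between the two barcodes can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which the abstract does not describe; the function names, toy sequences and shortened tail length are hypothetical and chosen only for readability. The sketch assumes that a short-barcode dataset can be derived from long-barcode ASVs by removing the non-shared 5' tail (68 bp in the real barcodes) and merging ASVs that become identical, and it includes a plain Pearson correlation of the kind used to compare per-sample index values.

```python
from collections import defaultdict
from math import sqrt

TAIL_LEN = 68  # length of the non-shared 5' region stated in the abstract


def trim_to_short_barcode(seq, tail_len=TAIL_LEN):
    """Drop the first `tail_len` bases (the 5' tail) of a long-barcode sequence."""
    return seq[tail_len:]


def collapse_asvs(asv_counts, tail_len=TAIL_LEN):
    """Trim every ASV and sum the read counts of ASVs that become identical."""
    merged = defaultdict(int)
    for seq, count in asv_counts.items():
        merged[trim_to_short_barcode(seq, tail_len)] += count
    return dict(merged)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): three "long" ASVs with a 4-base tail instead of 68.
# The first two differ only inside the tail, so they merge after trimming.
toy_asvs = {
    "AAAACGTACGT": 10,
    "AAATCGTACGT": 5,
    "AAAACGTTCGT": 3,
}
short_asvs = collapse_asvs(toy_asvs, tail_len=4)
print(len(toy_asvs), "long ASVs ->", len(short_asvs), "short ASVs")
print(short_asvs)
```

In this toy case, two variants that are distinct with the longer barcode become indistinguishable once the tail is removed, which is the kind of resolution loss the abstract reports for three species-level cases.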