Department of Zoology, Victorian Centre for Aquatic Pollution Identification and Management (CAPIM), The University of Melbourne, Victoria 3010, Australia.
Front Zool. 2013 Aug 7;10(1):45. doi: 10.1186/1742-9994-10-45.
Invertebrate communities are central to many environmental monitoring programs. In freshwater ecosystems, aquatic macroinvertebrates are collected, identified and then used to infer ecosystem condition. Yet the key step of species identification is often not taken, either because it requires a high level of taxonomic expertise that most organizations lack, or because species cannot be identified as they are morphologically cryptic or represent little-known groups. Identifying species using DNA sequences can overcome many of these issues; with the power of next generation sequencing (NGS), using DNA sequences for routine monitoring becomes feasible.
In this study, we test whether NGS can be used to identify species from field-collected samples in an important bioindicator group, the Chironomidae. We show that Cytochrome oxidase I (COI) and Cytochrome B (CytB) sequences provide accurate DNA barcodes for chironomid species. We then develop an NGS analysis pipeline to identify species using megablast searches of high quality sequences generated using 454 pyrosequencing against comprehensive reference libraries of Sanger-sequenced voucher specimens. We find that 454-generated COI sequences successfully identified up to 96% of species in samples, and that this increased to as much as 99% when they were combined with CytB sequences. Accurate identification depends on having at least five sequences for a species; below this level, species not expected in the samples were detected. Incorrect incorporation of some of the multiplex identifiers (MIDs) used to tag samples was a likely cause, and most errors could be detected when using MID tags on both forward and reverse primers. We also found a strong quantitative relationship between the number of 454 sequences and the number of individuals, showing that it may be possible to estimate the abundance of species from 454 pyrosequencing data.
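The read-count threshold described above can be illustrated with a minimal sketch. It assumes each read has already been assigned a best-matching species (for example from a megablast search), and it only applies the "at least five sequences" rule reported here. The function name `detect_species`, the input layout and the placeholder species labels are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

# Threshold reported in the abstract: at least five sequences per species.
MIN_READS = 5


def detect_species(best_hits, min_reads=MIN_READS):
    """Split species into detected and rejected by read count.

    best_hits: iterable of (read_id, species) pairs, one per read,
    where species is the best reference match for that read
    (hypothetical input layout, assumed for this sketch).
    """
    counts = Counter(species for _, species in best_hits)
    detected = {sp: n for sp, n in counts.items() if n >= min_reads}
    rejected = {sp: n for sp, n in counts.items() if n < min_reads}
    return detected, rejected


# Placeholder data: 12 reads for one species, 3 for another.
hits = [(f"read{i}", "species_A") for i in range(12)]
hits += [(f"read{i}", "species_B") for i in range(12, 15)]

detected, rejected = detect_species(hits)
print("detected:", detected)
print("below threshold:", rejected)
```

Keeping the below-threshold species in a separate dictionary, rather than discarding them, makes it possible to inspect low-count detections, which the abstract attributes mainly to incorrectly incorporated MID tags.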
Next generation sequencing using two genes was successful for identifying chironomid species. However, when detecting species from 454 pyrosequencing data sets it was critical to include known individuals for quality control and to establish thresholds for detecting species. The NGS approach developed here can lead to routine species-level diagnostic monitoring of aquatic ecosystems.