Department of Environmental Science, Policy and Management, University of California, Berkeley, California, USA.
Department of Biogeography, Faculty of Regional and Environmental Sciences, Trier University, Trier, Germany.
Mol Ecol Resour. 2021 Aug;21(6):1755-1758. doi: 10.1111/1755-0998.13414. Epub 2021 Jul 1.
DNA metabarcoding is a popular methodology for biodiversity assessment and is increasingly used for community-level analysis of intraspecific genetic diversity. The evolutionary history of hundreds of specimens can be captured in a single collection vial. However, the method is not without pitfalls, which may inflate or misrepresent recovered diversity metrics. Nuclear pseudogene copies of mitochondrial DNA (numts) have been particularly difficult to control because they can evolve rapidly and appear deceptively similar to true mitochondrial sequences. While the problem of numts has long been recognized for traditional sequencing approaches, the issues they create are particularly evident in metabarcoding, in which the identity of individual specimens is generally not known. In this issue of Molecular Ecology Resources, Andújar et al. (2021) provide an easy-to-implement bioinformatic approach to reduce erroneous sequences due to numts and residual noise in metabarcoding data sets. The metaMATE software designates input sequences as authentic (mtDNA haplotypes) or nonauthentic (numts and erroneous sequences) by comparison to reference data and by analysing nucleotide substitution patterns. Filtering is applied over a range of abundance thresholds, and the choice between a stricter and a more lenient sequence removal strategy is left to the researcher's discretion. This is a valuable addition to a growing number of complementary tools for improving the reliability of modern biodiversity monitoring.
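The filtering idea described above can be illustrated with a minimal Python sketch. This is not metaMATE's implementation or interface: the `ASV` record, the `in_reference` flag, the `classify` and `sweep_thresholds` functions, and the simple length and in-frame stop codon checks (used here as a stand-in for the analysis of nucleotide substitution patterns) are all hypothetical names and assumptions chosen for illustration. The sketch labels sequences as authentic, nonauthentic, or unknown, then reports how many of each survive at several read-abundance thresholds, so that a stricter or more lenient cutoff can be chosen.

```python
from dataclasses import dataclass
from typing import Optional

# Stop codons under the invertebrate mitochondrial genetic code
# (NCBI translation table 5). Assumption: the marker is a coding
# region read in a known frame.
STOP_CODONS = {"TAA", "TAG"}


@dataclass
class ASV:
    """Hypothetical record for one input sequence."""
    name: str
    sequence: str
    count: int          # total read abundance
    in_reference: bool  # assumed result of a prior reference database match


def classify(asv: ASV, expected_length: int, frame: int = 0) -> Optional[str]:
    """Return 'authentic', 'nonauthentic', or None when unverifiable."""
    if asv.in_reference:
        return "authentic"
    # A length difference that is not a multiple of 3 implies a frameshift.
    if (len(asv.sequence) - expected_length) % 3 != 0:
        return "nonauthentic"
    codons = [asv.sequence[i:i + 3]
              for i in range(frame, len(asv.sequence) - 2, 3)]
    if any(codon in STOP_CODONS for codon in codons):
        return "nonauthentic"
    return None


def sweep_thresholds(asvs, thresholds, expected_length):
    """Count retained sequences per label at each minimum-abundance threshold."""
    labels = {a.name: classify(a, expected_length) for a in asvs}
    rows = []
    for t in thresholds:
        kept = [a for a in asvs if a.count >= t]
        rows.append({
            "threshold": t,
            "retained": len(kept),
            "authentic_retained": sum(labels[a.name] == "authentic" for a in kept),
            "nonauthentic_retained": sum(labels[a.name] == "nonauthentic" for a in kept),
        })
    return rows


if __name__ == "__main__":
    # Toy data with 9 bp "amplicons"; real barcodes are far longer.
    demo = [
        ASV("hapA", "ATGGCACTA", 120, True),
        ASV("hapB", "ATGGCTCTA", 15, True),
        ASV("numt1", "ATGTAACTA", 4, False),  # in-frame stop codon
        ASV("numt2", "ATGGCACT", 2, False),   # frameshifted length
        ASV("unk", "ATGGCGCTA", 6, False),    # cannot be verified either way
    ]
    for row in sweep_thresholds(demo, [1, 3, 5, 20], expected_length=9):
        print(row)
```

In this toy data, raising the threshold first removes the verified nonauthentic sequences and only later begins to discard verified authentic haplotypes, which is the trade-off the researcher weighs when choosing a cutoff.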