Institut des Sciences de l'Évolution de Montpellier (ISEM), CNRS, EPHE, IRD, Université de Montpellier, Montpellier, France.
Laboratório Multidisciplinar para Análise de Dados (LAMPADA), Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
Mol Ecol Resour. 2020 Jul;20(4):892-905. doi: 10.1111/1755-0998.13160. Epub 2020 Apr 25.
Thanks to the development of high-throughput sequencing technologies, target enrichment sequencing of nuclear ultraconserved DNA elements (UCEs) now allows routine inference of phylogenetic relationships from thousands of genomic markers. Recently, it has been shown that mitochondrial DNA (mtDNA) is frequently sequenced alongside the targeted loci in such capture experiments. Despite its broad evolutionary interest, mtDNA is rarely assembled and used in conjunction with nuclear markers in capture-based studies. Here, we developed MitoFinder, a user-friendly bioinformatic pipeline, to efficiently assemble and annotate mitogenomic data from hundreds of UCE libraries. As a case study, we used ants (Formicidae) for which 501 UCE libraries have been sequenced whereas only 29 mitogenomes are available. We compared the efficiency of four different assemblers (IDBA-UD, MEGAHIT, MetaSPAdes, and Trinity) for assembling both UCE and mtDNA loci. Using MitoFinder, we show that metagenomic assemblers, in particular MetaSPAdes, are well suited to assemble both UCEs and mtDNA. Mitogenomic signal was successfully extracted from all 501 UCE libraries, allowing us to confirm species identification using CO1 barcoding. Moreover, our automated procedure retrieved 296 cases in which the mitochondrial genome was assembled in a single contig, thus increasing the number of available ant mitogenomes by an order of magnitude. By utilizing the power of metagenomic assemblers, MitoFinder provides an efficient tool to extract complementary mitogenomic data from UCE libraries, allowing testing for potential mitonuclear discordance. Our approach is potentially applicable to other sequence capture methods, transcriptomic data and whole genome shotgun sequencing in diverse taxa. The MitoFinder software is available from GitHub (https://github.com/RemiAllio/MitoFinder).