Department of Biological Sciences, 132 Long Hall, Clemson University, Clemson, SC, 29634, USA.
Smithsonian Marine Station at Fort Pierce, 701 Seaway Drive, Fort Pierce, Florida, 34949, USA.
BMC Genomics. 2022 Apr 22;23(1):320. doi: 10.1186/s12864-022-08482-z.
Whole mitochondrial genomes are quickly becoming markers of choice for exploring within-species genealogical and among-species phylogenetic relationships. Mitochondrial chromosomes are most often assembled using 'primer walking' or 'long PCR' strategies plus Sanger sequencing, or low-pass whole genome sequencing with Illumina short reads. In this study, we first confirmed that mitochondrial genomes can be sequenced from long reads using nanopore sequencing data exclusively. Next, we examined the accuracy of the long-read mitochondrial chromosome assemblies by comparing them to a 'gold' standard reference mitochondrial chromosome assembled from Illumina short reads.
Using a specialized bioinformatics tool, we first produced a short-read mitochondrial genome assembly for the silky shark Carcharhinus falciformis with an average base coverage of 9.8x. The complete mitochondrial genome of C. falciformis was 16,705 bp in length, 934 bp shorter than a previously assembled genome (17,639 bp) that was produced with bioinformatics tools not specialized for the assembly of mitochondrial chromosomes. Next, low-pass whole genome sequencing on a pocket-sized ONT MinION platform, combined with customized de novo and reference-based workflows, assembled and circularized a highly accurate mitochondrial genome for C. falciformis. Indels at the flanks of homopolymer regions explained most of the dissimilarities observed between the 'gold' standard reference mitochondrial genome (assembled using Illumina short reads) and each of the long-read mitochondrial genome assemblies. Although the long-read assemblies are not completely accurate, mitophylogenomic and barcoding analyses (using entire mitogenomes and the D-Loop/Control Region, respectively) suggest that they are reliable for identifying a sequenced individual, such as C. falciformis, and for separating that individual from others belonging to closely related congeneric species.
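The observation that indels near homopolymer regions account for most differences can be illustrated with a minimal sketch. This is not the workflow used in the study: the function names, the minimum run length of 3 bases, and the 1-base flank are all assumptions chosen for illustration. The input is assumed to be an uppercase pairwise alignment (reference row and long-read assembly row) with `-` marking gaps. Each gap column is counted separately.

```python
from collections import Counter


def homopolymer_runs(seq, min_len=3):
    """Return half-open (start, end) intervals of single-base runs of at least min_len."""
    runs = []
    start = 0
    for i in range(1, len(seq) + 1):
        # Close the current run at the end of the sequence or when the base changes.
        if i == len(seq) or seq[i] != seq[start]:
            if i - start >= min_len:
                runs.append((start, i))
            start = i
    return runs


def near_run(pos, runs, flank):
    """True if reference position pos lies inside a run or within `flank` bases of it."""
    return any(start - flank <= pos < end + flank for start, end in runs)


def classify_indels(ref_aln, qry_aln, min_len=3, flank=1):
    """Count gap columns that fall in or next to a reference homopolymer run.

    min_len and flank are illustrative thresholds (assumptions), not values
    taken from the study.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned rows must have equal length")
    ref = ref_aln.replace("-", "")
    runs = homopolymer_runs(ref, min_len)
    counts = Counter()
    r = 0  # index of the next ungapped reference base
    for ref_base, qry_base in zip(ref_aln, qry_aln):
        if ref_base == "-" and qry_base != "-":
            # Insertion in the query, located between reference positions r-1 and r.
            hit = near_run(r - 1, runs, flank) or near_run(r, runs, flank)
            counts["homopolymer" if hit else "other"] += 1
        elif qry_base == "-" and ref_base != "-":
            # Deletion in the query of reference position r.
            hit = near_run(r, runs, flank)
            counts["homopolymer" if hit else "other"] += 1
        if ref_base != "-":
            r += 1
    return counts


if __name__ == "__main__":
    # Toy alignment: one deletion inside an A-run, one deletion elsewhere.
    print(classify_indels("ACGTAAAAAGCTCAG", "ACGTAAAA-GCT-AG"))
```

In practice the aligned rows would come from an aligner run on the short-read reference and each long-read assembly; the toy strings here only show the bookkeeping.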
This study confirms that mitochondrial genomes can be sequenced from long-read nanopore sequencing data exclusively. With further development, nanopore technology can be used to quickly test for mislabeling in situ in the shark fin fishing industry and thus improve surveillance protocols, law enforcement, and the regulation of this fishery. This study will also assist with the transfer of high-throughput sequencing technology to middle- and low-income countries so that international scientists can explore population genomics in sharks using inclusive research strategies. Lastly, we recommend assembling mitochondrial genomes with specialized assemblers rather than assemblers developed for bacterial and/or nuclear genomes.