School of Biological Sciences, The University of Queensland, St. Lucia, Australia.
Mosquito Control Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia.
BMC Genomics. 2022 Jun 7;23(1):426. doi: 10.1186/s12864-022-08628-z.
An optimal starting point for relating genome function to organismal biology is a high-quality nuclear genome assembly, and long-read sequencing is revolutionizing the production of this genomic resource in insects. Despite this, nuclear genome assemblies remain under-represented for agricultural insect pests, particularly those from the order Coleoptera. Here we present a de novo genome assembly and structural annotation for the coconut rhinoceros beetle, Oryctes rhinoceros (Coleoptera: Scarabaeidae), based on Oxford Nanopore Technologies (ONT) long-read data generated from a wild-caught female. The assembly process also recovered complete circular assemblies of the beetle's mitochondrial genome and of the genome of its biocontrol agent, Oryctes rhinoceros nudivirus (OrNV). As an invasive pest of palm trees, O. rhinoceros is expanding its range across the Pacific Islands, requiring new approaches to management that may include strategies facilitated by genome assembly and annotation.
High-quality DNA isolated from an adult female was used to create four ONT libraries that were sequenced on four MinION flow cells, producing a total of 27.2 Gb of high-quality long-read sequences. We employed an iterative assembly process and polishing with one lane of high-accuracy Illumina reads, obtaining a final assembly of 377.36 Mb with high contiguity (fragment N50 length = 12 Mb) and accuracy, as evidenced by the exceptionally high completeness of the benchmarked set of conserved single-copy orthologous genes (BUSCO completeness = 99.1%). These quality metrics place our assembly ahead of the published Coleopteran genomes, including that of an insect model, the red flour beetle (Tribolium castaneum). The structural annotation of the nuclear genome assembly contained a highly accurate set of 16,371 protein-coding genes, with only 2.8% missing BUSCOs, and the expected number of non-coding RNAs. The number of paralogous genes in a gene family such as Sigma GST is lower than in another scarab beetle (Onthophagus taurus) but higher than in the red flour beetle (Tribolium castaneum), which suggests expansion of this GST class in Scarabaeidae. The quality of our gene models was also confirmed by the correct placement of O. rhinoceros among other members of the rhinoceros beetles (subfamily Dynastinae) in a phylogeny based on the sequences of 95 protein-coding genes in 373 beetle species from all major lineages of Coleoptera. Finally, we provide a list of 30 candidate dsRNA targets whose orthologs have been experimentally validated as highly effective targets for RNAi-based control of several beetles.
The genomic resources produced in this study form a foundation for further functional genetic research and for management programs that may inform the control and surveillance of O. rhinoceros populations. We also demonstrate the efficacy of de novo genome assembly using long-read ONT data from a single field-caught insect.
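For readers unfamiliar with the contiguity metric quoted above (fragment N50), the following is a minimal sketch of how N50 is conventionally computed from a list of fragment lengths. It is not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of fragment lengths.

    N50 is the length L such that fragments of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no fragment lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest fragment down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in Mb), NOT data from the study.
toy_lengths = [12, 9, 7, 5, 3, 2, 1]
print(n50(toy_lengths))
```

A higher N50 for a given assembly size means that more of the genome sits in long fragments, which is why the metric is commonly reported alongside BUSCO completeness.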