Institute of Biochemistry and Genetics, Ufa Federal Research Center of Russian Academy of Sciences, Ufa 450054, Russia.
SCAMT Institute, ITMO University, Saint-Petersburg 191002, Russia.
G3 (Bethesda). 2021 Sep 6;11(9). doi: 10.1093/g3journal/jkab223.
Apis mellifera L., the western honey bee, is a major crop pollinator that plays a key role in beekeeping and serves as an important model organism in social behavior studies. Recent efforts have improved the quality of the honey bee reference genome and produced a chromosome-level assembly of 16 chromosomes, two of which are gapless. However, the remaining chromosomes contain 51 gaps, and the assembly still has 160 unplaced/unlocalized scaffolds and lacks 2 distal telomeres. The gaps are located in hard-to-assemble, extended, highly repetitive chromosomal regions that may contain functional genomic elements. Here, we use de novo re-assemblies of the raw reads of the most recent reference genome, Amel_HAv_3.1, together with other long-read-based assemblies of the honey bee genome (INRA_AMelMel_1.0, ASM1384120v1, and ASM1384124v1), to resolve 13 gaps, 5 unplaced/unlocalized scaffolds, and the missing telomeres of Amel_HAv_3.1. The total length of the resolved gaps is 848,747 bp. The accuracy of the corrected assembly was validated by mapping PacBio reads and by assessing gene annotation. Comparative analysis suggests that the PacBio-read-based assemblies of the honey bee genome failed in the same extended, highly repetitive regions of the chromosomes, especially on chromosome 10. Fully resolving these extended repetitive regions would require further work using ultra-long Nanopore sequencing. Our updated assembly facilitates more accurate reference-guided scaffolding and marker/sequence mapping in honey bee genomics studies.
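As an illustration of how gap counts and total gap length such as those quoted above can be tallied, the following is a minimal sketch, not the authors' pipeline. It assumes that gaps are encoded as runs of `N` in the assembly FASTA, which is a common convention but is not stated in the abstract. The function names (`find_gaps`, `parse_fasta`, `gap_summary`) and the toy sequences are hypothetical.

```python
import re


def find_gaps(sequence, min_len=1):
    """Return half-open (start, end) intervals for runs of N/n of at least min_len bases."""
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


def parse_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gap_summary(lines, min_len=1):
    """Map each sequence name to (number of gaps, total gap length in bp)."""
    summary = {}
    for name, seq in parse_fasta(lines):
        gaps = find_gaps(seq, min_len)
        summary[name] = (len(gaps), sum(end - start for start, end in gaps))
    return summary


# Toy input (hypothetical sequences, not real honey bee data).
toy = [">chr_toy_1", "ACGTNNNNACGT", "NNACGT", ">chr_toy_2", "ACGTACGT"]
print(gap_summary(toy))
```

On a real assembly the same functions could be fed an open file handle instead of the toy list. Raising `min_len` excludes short runs of ambiguous bases that are not true scaffold gaps; the appropriate threshold depends on how a given assembly encodes its gaps.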