Centre for Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3220, Australia.
Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia.
Gigascience. 2018 Mar 1;7(3):1-6. doi: 10.1093/gigascience/gix137.
Some of the most widely recognized coral reef fishes are clownfish or anemonefish, members of the family Pomacentridae (subfamily: Amphiprioninae). They are popular aquarium species due to their bright colours, adaptability to captivity, and fascinating behaviour. Their breeding biology (they are sequential hermaphrodites) and symbiotic mutualism with sea anemones have attracted much scientific interest. Moreover, some curious geographically based phenotypes warrant investigation. Leveraging advances in Nanopore long-read technology, we report the first hybrid assembly of the clown anemonefish (Amphiprion ocellaris) genome using Illumina and Nanopore reads, further demonstrating the substantial impact of modest long-read sequencing data sets on genome assembly statistics.
We generated 43 Gb of short Illumina reads and 9 Gb of long Nanopore reads, representing approximate genome coverage of 54× and 11×, respectively, based on the range of k-mer-predicted genome sizes of between 791 and 967 Mbp. The final assembled genome is contained in 6404 scaffolds with an accumulated length of 880 Mb (96.3% BUSCO-calculated genome completeness). Compared with the Illumina-only assembly, the hybrid approach generated 94% fewer scaffolds with an 18-fold increase in N50 length (to 401 kb) and increased the genome completeness by an additional 16%. A total of 27 240 high-quality protein-coding genes were predicted from the clown anemonefish genome, 26 211 (96%) of which were annotated functionally with information from either sequence homology or protein signature searches.
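The coverage figures and the N50 statistic quoted above follow from simple arithmetic, which the minimal Python sketch below illustrates. It is not the authors' pipeline: the function names `coverage` and `n50` are hypothetical, 1 Gb is assumed to equal 1000 Mb, and the scaffold lengths passed to `n50` are toy values rather than data from the assembly. Under these assumptions, the reported 54× and 11× appear to correspond to the lower bound (791 Mbp) of the estimated genome size range.

```python
def coverage(total_bases_gb: float, genome_size_mbp: float) -> float:
    """Mean sequencing depth = total sequenced bases / genome size.

    Assumes 1 Gb = 1000 Mb (an assumption of this sketch).
    """
    return total_bases_gb * 1000.0 / genome_size_mbp


def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Values taken from the abstract: read yields (Gb) and the
# k-mer-predicted genome size range (Mbp).
GENOME_SIZE_RANGE_MBP = (791, 967)
READ_YIELD_GB = {"Illumina": 43, "Nanopore": 9}

if __name__ == "__main__":
    small, large = GENOME_SIZE_RANGE_MBP
    for platform, gigabases in READ_YIELD_GB.items():
        upper = coverage(gigabases, small)  # smaller genome gives higher depth
        lower = coverage(gigabases, large)
        print(f"{platform}: {lower:.1f}x to {upper:.1f}x")

    # Toy scaffold lengths (hypothetical, for illustration only).
    print("toy N50:", n50([2, 3, 4, 5, 6]))
```

Computing depth at both ends of the genome size range makes explicit how much the coverage estimate depends on the assumed genome size.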
We present the first genome of any anemonefish and demonstrate the value of low coverage (∼11×) long Nanopore read sequencing in improving both genome assembly contiguity and completeness. The near-complete assembly of the A. ocellaris genome will be an invaluable molecular resource for supporting a range of genetic, genomic, and phylogenetic studies specifically for clownfish and more generally for other related fish species of the family Pomacentridae.