Cardoso Sara D, Jiang Chunxi, Sun Lina, Zhang Libin, Gonçalves David
Institute of Science and Environment, University of Saint Joseph, Rua de Londres 106, Macau, SAR, China.
CAS Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China.
Sci Data. 2024 Dec 23;11(1):1424. doi: 10.1038/s41597-024-04242-8.
The peacock blenny Salaria pavo is well known for its extreme male sexual polymorphism, with large males defending nests and younger reproductive males mimicking the appearance and behavior of females to parasitically fertilize eggs. The lack of a reference genome has, to date, limited the understanding of the genetic basis of the species' phenotypic plasticity. Here, we present the first reference genome assembly of the peacock blenny, generated using PacBio HiFi long reads and Hi-C sequencing data. The final assembly of the S. pavo genome spanned 735.90 Mbp, with a contig N50 of 3.69 Mbp and a scaffold N50 of 31.87 Mbp. A total of 98.77% of the assembly was anchored to 24 chromosomes. In total, 24,008 protein-coding genes were annotated, and 99.0% of BUSCO genes were fully represented. Comparative analyses with closely related species showed that 86.2% of these genes were assigned to orthogroups. This high-quality genome of S. pavo will be a valuable resource for future research on this species' reproductive plasticity and evolutionary history.
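The contig and scaffold N50 values quoted above follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation and not the pipeline used for this assembly; the function name `n50` and the toy lengths are assumptions introduced only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration; not taken from the assembly pipeline.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence downwards, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half
        # of the total assembly length is the N50.
        if running >= half_total:
            return length


# Toy example (made-up lengths, not S. pavo data): total is 54, half is 27,
# and 10 + 9 + 8 = 27 reaches that threshold.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # prints 8
```

Applied to the lengths of all contigs or all scaffolds in an assembly, the same function would give the contig N50 or scaffold N50, respectively.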