Key Laboratory of Breeding Biotechnology and Sustainable Aquaculture, Institute of Hydrobiology, The Innovative Academy of Seed Design, Hubei Hongshan Laboratory, Guangdong Laboratory for Lingnan Modern Agriculture, Chinese Academy of Sciences, Wuhan, 430072, China.
Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China.
Sci Data. 2024 Jun 22;11(1):675. doi: 10.1038/s41597-024-03495-7.
The greater amberjack (Seriola dumerili) is a commercially valuable fishery species distributed worldwide. Transcriptome-based studies on S. dumerili have been limited by an inadequate reference genome and a lack of well-annotated full-length transcripts. In this study, a total of 12 tissues from juvenile and adult fish of both sexes were collected for next-generation RNA sequencing (RNA-seq) and full-length isoform sequencing (Iso-seq). For Iso-seq, a total of 163,218, 149,716, and 189,169 high-quality unique transcript sequences were obtained from juvenile, adult male, and adult female S. dumerili, with N50 values of 5,441, 5,255, and 5,939 bp, respectively. We integrated the Iso-seq and RNA-seq data to construct a comprehensive gene annotation and systematically profiled the dynamics of gene expression across the 12 tissues. Our gene models had greater detail and accuracy than those from NCBI and Ensembl, with more precise polyA locations. These resources serve as a foundation for functional genomic studies and provide valuable insights into the molecular mechanisms underlying the development, reproduction, and commercial traits of amberjack.
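For readers unfamiliar with the N50 statistic quoted for the Iso-seq transcripts, the following minimal Python sketch shows one common way to compute it: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration only: N50 is taken here as the
    length L such that sequences of length >= L together cover at least
    half of the total summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy data, not taken from the study.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # 8: 10 + 9 + 8 = 27, half of the total 54
```

In practice the lengths would be read from the transcript FASTA file for each sample, so a higher N50 indicates that more of the total sequence is contained in long transcripts.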