Lau Melinda Mei Lin, Lim Leonard Whye Kit, Chung Hung Hui, Gan Han Ming
Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak 94300, Malaysia.
GeneSEQ Sdn Bhd, Bandar Bukit Beruntung, Rawang, Selangor 48300, Malaysia.
Data Brief. 2021 Oct 14;39:107481. doi: 10.1016/j.dib.2021.107481. eCollection 2021 Dec.
The Javan mahseer is one of the most valuable freshwater fish. To date, other than mitogenomic data (BioProject: PRJNA422829), genomic and transcriptomic resources for this species are still lacking, although such resources are crucial for understanding the molecular mechanisms associated with important traits such as growth, immune response, reproduction and sex determination. For the first time, we sequenced the transcriptome of a whole juvenile fish using the Illumina NovaSeq 6000, generating raw paired-end reads. Transcriptome assembly generated a draft transcriptome (BUSCO5 completeness of 91.2% [Actinopterygii_odb10 database]) consisting of 259,403 putative transcripts with a total length of 333,881,215 bp and an N50 length of 2283 bp. A total of 77,503 non-redundant protein-coding sequences were predicted from the transcripts and used for functional annotation. We mapped the predicted proteins to 304 known KEGG pathways, with the signal transduction cluster having the highest representation, followed by the immune system and endocrine system. In addition, transcripts exhibiting significant similarity to previously published growth- and immune-related genes were identified, which will facilitate future molecular breeding of this species.
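For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally defined: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and chosen only for illustration; this is not the authors' pipeline, and assembly tools may report the value with their own conventions.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths (in bp)."""
    # Sort from longest to shortest so the longest sequences are counted first.
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")

    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length
    # Unreachable for non-empty input; kept to make the control flow explicit.
    raise RuntimeError("N50 could not be determined")


if __name__ == "__main__":
    # Hypothetical toy transcript lengths, not data from the study.
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_lengths))
```

Applied to the lengths of all 259,403 putative transcripts, a calculation of this kind would be expected to give the reported 2283 bp, assuming the same definition was used.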