Yan Wei-Jie, Hussain Hasnain, Chung Hung Hui, Julaihi Norzainizul, Tommy Rina
Centre for Sago Research (CoSAR), Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia.
Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia.
Data Brief. 2022 Feb 3;41:107908. doi: 10.1016/j.dib.2022.107908. eCollection 2022 Apr.
Sago palm (Metroxylon sagu Rottb.) is an important starch-producing palm that contributes to Malaysia's economy, especially in the State of Sarawak, Malaysian Borneo. In this palm, starch is stored in the central part of the plant. Under normal conditions, a sago palm develops its trunk 4-5 years after planting. However, sago palms planted on deep peat soil have failed to develop a trunk even 17 years after planting. This phenomenon, known as 'non-trunking', eliminates the economic value of the palms. Numerous studies have addressed the phenomenon, but the molecular mechanisms by which sago palm responds to the responsible stresses remain poorly understood. Therefore, in this study, leaf samples were collected from trunking (normal) and non-trunking sago palms planted on peat soil for total RNA extraction, followed by next-generation sequencing on the BGISEQ-500 platform. The raw reads were cleaned and assembled using the Trinity software package. A total of 40.11 Gb of sequence data was generated from the sago palm leaf samples. The assembly produced 102,447 unigenes, with an N50 of 1809 bp and a GC ratio of 44.34%. Alignment of the unigenes against seven functional databases (NR, NT, GO, KOG, KEGG, SwissProt and InterPro) annotated 65,523 (63.96%) unigenes. TransDecoder detected 46,335 coding DNA sequences. A total of 30,039 simple sequence repeats distributed across 21,676 unigenes were detected using Primer3 software, and 2355 transcription factor-coding unigenes were predicted using getorf and hmmsearch. This work is registered under NCBI BioProject PRJNA781491. The raw RNA sequencing data are available in the Sequence Read Archive (SRA) database under accession numbers SRX13165895, SRX13165896, SRX13165897, SRX13165898, SRX13165899, and SRX13165900. Gene expression and annotation information are accessible in the public functional genomics data repository Gene Expression Omnibus (GEO) under accession number GSE189085.
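For readers unfamiliar with the assembly statistics quoted in the abstract, the following is a minimal Python sketch of how an N50 and a GC ratio are conventionally computed from a set of assembled sequences. It is an illustration under stated assumptions, not the authors' pipeline: the function names `n50` and `gc_ratio` and the toy sequences are hypothetical, and the abstract does not say which tool produced the reported values. The last line only re-derives the annotated percentage from the two counts given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequences given")


def gc_ratio(sequences):
    """Return the percentage of G and C bases across all sequences."""
    total_bases = sum(len(seq) for seq in sequences)
    gc_bases = sum(seq.upper().count("G") + seq.upper().count("C")
                   for seq in sequences)
    return 100.0 * gc_bases / total_bases


# Toy sequences (hypothetical, not taken from the sago palm assembly).
toy_unigenes = ["ATGCGC", "ATAT", "GG", "ACGTACGTAC"]
toy_n50 = n50(len(seq) for seq in toy_unigenes)
toy_gc = gc_ratio(toy_unigenes)

# Annotated fraction re-derived from the counts reported in the abstract.
annotated_percent = round(100 * 65523 / 102447, 2)
```

With real data, the sequence lengths would be read from the assembled unigene FASTA file rather than from a hard-coded list.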