Huang Yating, Wu Xiaoqiu, Jian Duan, Zhan Yaguang, Fan Guizhi
Department of Forest Bioengineering, College of Life Science, Northeast Forestry University, Harbin, P.R. China.
Biotechnol Biotechnol Equip. 2015 Mar 4;29(2):395-403. doi: 10.1080/13102818.2015.1008228. Epub 2015 Feb 5.
The aim of this study was to facilitate gene discovery for functional genome studies and to identify simple sequence repeat (SSR) markers for molecular-assisted selection in [species name not given]. The transcriptome of [species name not given] was sequenced using a high-throughput RNA sequencing system, the Illumina HiSeq 2000. A total of 16,383,818 clean sequencing reads, 35,532 contigs and 25,811 unigenes were obtained. Based on similarity searches against known proteins, 19,350 genes (74.97% of the unigenes) were annotated. Of these, 19,266, 10,978 and 7,831 unigenes were mapped to the Nr, Swiss-Prot and Clusters of Orthologous Groups (COG) databases, respectively. Of all unigenes, 6,845 were categorized into three functional groups (biological process, cellular component and molecular function), and 11,088 were annotated to 108 pathways by searching the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. A total of 1,129 SSRs were identified in these unigenes. In addition, 23 candidate genes potentially involved in sterol biosynthesis were identified and merit further investigation.
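As a quick check on the annotation figures, the share of unigenes annotated in each database can be recomputed from the counts given in the abstract. This is only a minimal sketch: the dictionary labels and the helper name `percent_of_unigenes` are illustrative choices, not part of the study's pipeline.

```python
# Counts as reported in the abstract; the labels are illustrative.
TOTAL_UNIGENES = 25_811
annotated = {
    "any database": 19_350,
    "Nr": 19_266,
    "Swiss-Prot": 10_978,
    "COG": 7_831,
    "KEGG": 11_088,
}


def percent_of_unigenes(count, total=TOTAL_UNIGENES):
    """Return count as a percentage of all unigenes, rounded to 2 decimals."""
    return round(100 * count / total, 2)


# Hypothetical summary table: database label -> percentage of unigenes.
shares = {name: percent_of_unigenes(n) for name, n in annotated.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The "any database" entry reproduces the 74.97% reported in the abstract; the remaining percentages are derived here and are not stated in the source.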