Zhang X J, Jiang H Y, Li L M, Yuan L H, Chen J P
Guangdong Entomological Institute, Guangzhou, China.
Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangzhou, China.
Genet Mol Res. 2016 Jun 20;15(2):gmr7999. doi: 10.4238/gmr.15027999.
The aim of this study was to provide comprehensive insight into the genetic background of sturgeon through transcriptome analysis. We performed a de novo assembly of the transcriptome of the Amur sturgeon, Acipenser schrenckii, using Illumina HiSeq 2000 sequencing. A total of 148,817 non-redundant unigenes were obtained, totaling approximately 121,698,536 bp and ranging in length from 201 to 26,789 bp. Homologous transcript cluster analysis classified the unigenes into 3368 distinct clusters and 145,449 singletons. In all, 46,865 (31.49%) unigenes had homologous matches in the Nr database and 32,214 (21.65%) had matches in the Nt database. GO analysis assigned 24,862 unigenes to 52 significantly enriched functional groups, KOG prediction classified 38,436 unigenes into 25 groups, and 45,598 unigenes mapped to 128 enriched KEGG pathways (P < 0.05). Subsequently, a total of 19,860 SSR markers were identified; di-nucleotide repeats were the most abundant type (10,658; 53.67%), and AT/TA was the most frequent motif (2689; 13.54%). A total of 1341 conserved lncRNAs were identified using a customized pipeline. Our study provides new sequence and functional information for A. schrenckii, which will serve as a basis for further genetic studies on sturgeon species. The large number of potential SSRs and putatively conserved lncRNAs identified from the transcriptome should also support research in many fields, including the evolution, conservation management, and biological processes of sturgeon.
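The SSR counts above come from scanning unigene sequences for short tandem repeats. The abstract does not name the tool or thresholds used, so the following is only a minimal sketch of the general idea, not the authors' method. The function names (`find_ssrs`, `canonical`) and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length (unit length -> repeats).
# These thresholds are illustrative assumptions, not values from the study.
MIN_REPEATS = {2: 6, 3: 5}


def canonical(motif):
    """Return the lexicographically smallest rotation, so TA is grouped with AT."""
    return min(motif[i:] + motif[:i] for i in range(len(motif)))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return hits


if __name__ == "__main__":
    # Toy sequence: one (AT)7 di-nucleotide repeat and one (CAG)5 tri-nucleotide repeat.
    toy = "GGC" + "AT" * 7 + "CCGT" + "CAG" * 5 + "TT"
    for start, motif, count in find_ssrs(toy):
        print(start, canonical(motif), count)
```

Grouping motifs by their smallest rotation is one simple way to report AT and TA repeats as a single AT/TA class, as the abstract does; dedicated SSR tools apply more elaborate rules (for example, compound repeats), which this sketch does not attempt.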