Kumar Santosh, Shah Niraj, Garg Vanika, Bhatia Sabhyata
National Institute of Plant Genome Research, Aruna Asaf Ali Marg, PO Box 10531, New Delhi, 110067, India.
Plant Cell Rep. 2014 Jun;33(6):905-18. doi: 10.1007/s00299-014-1569-8. Epub 2014 Feb 1.
Transcriptomic data of C. roseus offer ample sequence resources for better insight into gene diversity, and a large resource of genic SSR markers to accelerate genomic studies and breeding in Catharanthus. Next-generation sequencing is an efficient approach for generating transcript and gene sequences at high throughput and for developing molecular markers. We present here the transcriptome sequencing of 26-day-old Catharanthus roseus seedling tissue using the Illumina GAIIX platform, which yielded 3.37 Gb of nucleotide sequence data comprising 29,964,104 reads that were de novo assembled into 26,581 unigenes. Based on similarity searches, 58 % of the unigenes were annotated, of which 13,580 unique transcripts were assigned 5,016 gene ontology terms. Further, 7,687 of the unigenes had Clusters of Orthologous Groups (COG) classifications, and 4,006 were assigned to 289 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Also, 5,221 (19.64 %) of the transcripts were distributed across 81 known transcription factor (TF) families. In silico analysis of the transcriptome identified 11,004 SSRs in 26.62 % of the transcripts, from which 2,520 SSR markers were designed; the SSRs exhibited a non-random pattern of distribution. Trinucleotide repeats (AAG/CTT) were the most abundant, followed by dinucleotide repeats (AG/CT). Location-specific analysis revealed that SSRs were preferentially associated with the 5'-UTRs, with a predicted role in the regulation of gene expression. PCR validation of a set of 48 primers showed successful amplification for 97.9 %, and 76.6 % of these were polymorphic across different Catharanthus species as well as accessions of C. roseus. In summary, this study provides insight into seedling development and resources for novel gene discovery and SSR development for use in marker-assisted selective breeding in C. roseus.
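The in silico SSR search described above can be illustrated with a minimal Python sketch. The abstract does not name the tool or the repeat thresholds that were used, so the function names and the `MIN_REPEATS` values below are assumptions for illustration only. The sketch finds perfect di- and trinucleotide repeats and groups each motif with its rotations and reverse complement, which is the convention behind class labels such as AAG/CTT and AG/CT.

```python
import re

# Hypothetical thresholds (motif length -> minimum repeat count);
# these are illustrative and not taken from the study.
MIN_REPEATS = {2: 6, 3: 5}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the alphabetically smallest rotation of a motif or its reverse complement."""
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(m))]
    return min(rotations)


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AAA')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_ssrs(seq):
    """Yield (start, motif_class, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    for size, min_rep in MIN_REPEATS.items():
        # A captured unit of `size` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                yield match.start(), canonical_motif(motif), len(match.group(0)) // size


if __name__ == "__main__":
    # Toy sequence, not real C. roseus data.
    toy = "GGC" + "CTT" * 5 + "ACGTA" + "AG" * 6 + "T"
    for hit in sorted(find_ssrs(toy)):
        print(hit)
```

Counting the `motif_class` values over all unigenes would give a motif-frequency table of the kind summarised in the abstract. Compound or imperfect repeats, which dedicated SSR tools usually handle, are outside the scope of this sketch.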