Department of Biology, East Carolina University, Greenville, NC 27858, USA.
Planta. 2012 Jul;236(1):101-13. doi: 10.1007/s00425-012-1591-4. Epub 2012 Jan 21.
High-throughput RNA sequencing was performed to comprehensively analyze the transcriptome of the purple sweet potato. A total of 58,800 unigenes were obtained, ranging from 200 nt to 10,380 nt with an average length of 476 nt. The average expression of a unigene was 34 reads per kb per million reads (RPKM), with a maximum expression of 1,935 RPKM. At least 40,280 (68.5%) unigenes were identified as protein-coding genes, of which 11,978 and 5,184 were homologous to Arabidopsis and rice proteins, respectively. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses showed that 19,707 (33.5%) unigenes were assigned to 1,807 GO terms covering molecular functions, biological processes, and cellular components, and that 9,970 (17.0%) unigenes were assigned to 11,119 KEGG pathways. We found that at least 3,553 genes may be involved in the biosynthesis pathways of starch, alkaloids, anthocyanin pigments, and vitamins. Additionally, 851 potential simple sequence repeats (SSRs) were identified across all unigenes. Transcriptome sequencing of tuberous roots of the sweet potato yielded substantial transcript sequences and potentially useful SSR markers, which provide an important data source for sweet potato research. Comparison of two RNA-sequencing datasets from the purple and the yellow sweet potato showed that UDP-glucose-flavonoid 3-O-glucosyltransferase was one of the key enzymes in the anthocyanin biosynthesis pathway and that anthocyanin-3-glucoside might be one of the major components of the anthocyanin pigments in the purple sweet potato. This study contributes to understanding the molecular mechanisms of sweet potato development and metabolism and thereby increases the potential utilization of the sweet potato in food nutrition and pharmacy.
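For readers unfamiliar with the expression unit used above, RPKM is conventionally defined as the number of reads mapped to a transcript, divided by the transcript length in kilobases and by the total number of mapped reads in millions. The following is a minimal sketch of that standard definition, not the pipeline used in the study; the function name `rpkm` and the example inputs are hypothetical and chosen only for illustration.

```python
def rpkm(read_count: int, length_nt: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    Standard definition: reads / (length in kb) / (total mapped reads in millions),
    which simplifies to read_count * 1e9 / (length_nt * total_mapped_reads).
    """
    if length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length_nt and total_mapped_reads must be positive")
    return read_count * 1e9 / (length_nt * total_mapped_reads)


# Hypothetical example: a 476 nt unigene with 162 mapped reads in a
# library of 10 million mapped reads (these inputs are NOT from the study).
example_value = rpkm(read_count=162, length_nt=476, total_mapped_reads=10_000_000)
print(f"{example_value:.1f} RPKM")
```

Because the value is normalized by both transcript length and library size, RPKM figures such as the 34 RPKM average reported above can be compared between unigenes of different lengths within a sample.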