Department of Plant Resources, College of Industrial Sciences, Kongju National University, Yesan 340-802, Republic of Korea.
Mol Biol Rep. 2012 Mar;39(3):3327-35. doi: 10.1007/s11033-011-1102-x. Epub 2011 Jun 25.
Transcriptome data from high-throughput sequencing-by-synthesis are a rich source of molecular markers. In this study, we demonstrate the utility of massively parallel sequencing-by-synthesis for profiling the transcriptome of red pepper (Capsicum annuum L. TF68) using 454 GS-FLX pyrosequencing. Approximately 30.63 megabases (Mb) of expressed sequence tag (EST) data with an average length of 375 base pairs (bp) were generated, and assembly of the raw reads yielded 9,818 contigs and 23,712 singletons. Using BLAST alignment against the NCBI non-redundant and UniProt protein databases, 30% of the tentative consensus sequences were assigned a specific functional annotation, while 24% returned alignments of unknown function, leaving up to 46% with no alignment. Functional classification using FunCat revealed that sequences with putative known function were distributed across 18 categories. When aligned against the tomato (Solanum lycopersicum) pseudomolecules, the unigenes showed an approximately equal distribution across chromosomes. Furthermore, 1,536 high-quality single nucleotide discrepancies were discovered using the Bukang mature fruit cDNA collection (dbEST ID: 23667) as a reference. In addition, 758 simple sequence repeat (SSR) motif loci were mined from 614 contigs, and 572 primer sets were designed from them. Di- and trinucleotide motifs accounted for 27.03% and 61.92% of the SSR motifs, respectively. These molecular markers may be of great value for linkage mapping and association mapping research.
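The SSR mining step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the tool or the thresholds that were used, so everything here is an assumption for illustration: the function name `find_ssrs`, the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides), and the restriction to perfect repeats of those two motif lengths.

```python
import re

# Assumed minimum repeat counts per motif length (motif length -> copies).
# The abstract does not state the thresholds actually used.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif copy; \1{n,} then requires
        # at least n further identical copies right after it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits


if __name__ == "__main__":
    # Hypothetical contig fragment, not taken from the study's data.
    contig = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TTC"
    for start, motif, count in find_ssrs(contig):
        print(start, motif, count)
```

Applied to each assembled contig, a scan of this kind gives the loci from which flanking primers could then be designed. Imperfect or compound repeats, which dedicated SSR tools usually handle, are outside the scope of this sketch.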