Ahn Yul-Kyun, Tripathi Swati, Cho Young-Il, Kim Jeong-Ho, Lee Hye-Eun, Kim Do-Sun, Woo Jong-Gyu, Cho Myeong-Cheoul
Vegetable Research Division, National Institute of Horticultural & Herbal Science, Rural Development Administration, Suwon, 440-706, Republic of Korea.
Bot Stud. 2013 Dec;54(1):58. doi: 10.1186/1999-3110-54-58. Epub 2013 Nov 21.
Pepper (Capsicum annuum L., Solanaceae) is an economically important vegetable crop worldwide. Functional genomics resources and genome-wide association studies for pepper remain limited and could be substantially improved by applying molecular approaches to characterize gene content and identify molecular markers. Here we describe massively parallel pyrosequencing of two pepper varieties, the highly pungent Saengryeg 211 and the non-pungent Saengryeg 213, including de novo transcriptome assembly, functional annotation, and in silico discovery of potential molecular markers. We performed 454 GS-FLX Titanium sequencing of polyA-selected and normalized cDNA libraries generated from a single pool of transcripts obtained from mature fruits of the two pepper varieties.
A single 454 pyrosequencing run generated 361,671 and 274,269 reads totaling 164.49 and 124.60 Mb of sequence data (average read length of 454 nucleotides), which assembled into 23,821 and 17,813 isotigs and 18,147 and 15,129 singletons for the two varieties, respectively. These reads were organized into 20,352 and 15,781 'isogroups' for the two varieties, respectively. Assembled sequences were functionally annotated based on homology to genes in multiple public databases and assigned Gene Ontology (GO) terms. Sequence variant analysis identified 3,766 and 2,431 potential simple sequence repeat (SSR) motifs for microsatellite analysis in the two varieties, respectively. Trinucleotides were the most common repeat unit (84%), followed by di- (9.9%), hexa- (4.1%) and pentanucleotide repeats (2.1%). GAA (8.6%) was the most frequent repeat motif, followed by TGG (7.2%), TTC (6.5%), and CAG (6.2%).
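As an illustration of what in silico SSR discovery involves, the following is a minimal Python sketch that scans a sequence for perfect tandem repeats. It is not the pipeline used in the study: the function names (`find_ssrs`, `is_compound`) and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example, and the abstract does not state which tool or thresholds were applied.

```python
import re

# Minimum number of tandem copies required per repeat-unit length.
# These thresholds are assumptions for illustration, not values from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 4, 6: 4}


def is_compound(unit):
    """Return True if `unit` is itself a repetition of a shorter motif
    (e.g. 'AAA' or 'ATAT'), so it is not reported under the wrong class."""
    return unit in (unit + unit)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Find perfect SSRs in `seq`.

    Returns a list of (start, unit, copies) tuples sorted by start position,
    where `start` is 0-based and `copies` is the number of tandem repeats.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # Capture one unit, then require at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_compound(unit):
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), unit, copies))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence (hypothetical, not from the pepper data).
    toy = "TTGC" + "GAA" * 6 + "CCGG" + "AT" * 7 + "GGC"
    for start, unit, copies in find_ssrs(toy):
        print(start, unit, copies)
```

The sketch reports motifs exactly as they appear on the given strand, so GAA and its reverse complement TTC are counted separately, in the same way the abstract lists them as distinct motifs. Imperfect and compound microsatellites are not handled.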
High-throughput transcriptome assembly, annotation, and large-scale SSR marker discovery have been achieved using next-generation sequencing (NGS) of two pepper varieties. This functional genomics resource should help to further improve pepper breeding efforts with respect to genetic linkage maps, QTL mapping, and marker-assisted trait selection.