Department of Biology, University of Virginia, Charlottesville, VA 22903, USA.
Mol Ecol Resour. 2012 Mar;12(2):333-43. doi: 10.1111/j.1755-0998.2011.03079.x. Epub 2011 Oct 17.
Members of the angiosperm genus Silene are widely used in studies of ecology and evolution, but available genomic and population genetic resources within Silene remain limited. Deep transcriptome (i.e. expressed sequence tag or EST) sequencing has proven to be a rapid and cost-effective means to characterize gene content and identify polymorphic markers in non-model organisms. In this study, we report the results of 454 GS-FLX Titanium sequencing of a polyA-selected and normalized cDNA library from Silene vulgaris. The library was generated from a single pool of transcripts, combining RNA from leaf, root and floral tissue from three genetically divergent European subpopulations of S. vulgaris. A single full-plate 454 run produced 959,520 reads totalling 363.6 Mb of sequence data with an average read length of 379.0 bp after quality trimming and removal of custom library adaptors. We assembled 832,251 (86.7%) of these reads into 40,964 contigs, which have a total length of 25.4 Mb and can be organized into 18,178 graph-based clusters or 'isogroups'. Assembled sequences were annotated based on homology to genes in multiple public databases. Analysis of sequence variants identified 13,432 putative single-nucleotide polymorphisms (SNPs) and 1320 simple sequence repeats (SSRs) that are candidates for microsatellite analysis. Estimates of nucleotide diversity from 1577 contigs were used to generate genome-wide distributions that revealed several outliers with high diversity. All of these resources are publicly available through NCBI and/or our website (http://silenegenomics.biology.virginia.edu) and should provide valuable genomic and population genetic tools for the Silene research community.
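The assembly figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original study: it takes the counts reported in the abstract, assumes that "Mb" means 10^6 bases, and derives the fraction of reads assembled, the mean contig length and the mean number of contigs per isogroup. The function name `assembly_summary` and the constant names are hypothetical and chosen only for illustration.

```python
# Counts copied from the abstract. "Mb" is ASSUMED to mean 10**6 bases.
READS_TOTAL = 959_520       # reads after trimming and adaptor removal
READS_ASSEMBLED = 832_251   # reads placed into contigs
CONTIGS = 40_964            # assembled contigs
ASSEMBLY_BP = 25.4e6        # total contig length (25.4 Mb)
ISOGROUPS = 18_178          # graph-based clusters of contigs


def assembly_summary():
    """Derive simple ratios from the reported assembly counts."""
    return {
        "fraction_assembled": READS_ASSEMBLED / READS_TOTAL,
        "mean_contig_bp": ASSEMBLY_BP / CONTIGS,
        "contigs_per_isogroup": CONTIGS / ISOGROUPS,
    }


if __name__ == "__main__":
    for name, value in assembly_summary().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the assembled fraction rounds to the 86.7% stated in the abstract, the mean contig length comes out at roughly 620 bp, and each isogroup contains a little over two contigs on average. Because the total assembly length is itself rounded to one decimal place, the derived values are approximate.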