Department of Biology, University of British Columbia, Okanagan Campus, 3333 University Way, Kelowna BC, V1V 1V7, Canada.
BMC Genomics. 2013 May 7;14:308. doi: 10.1186/1471-2164-14-308.
High-throughput next-generation sequencing technology has enabled the collection of genome-wide sequence data and revolutionized single nucleotide polymorphism (SNP) discovery in a broad range of species. When analyzed within a population genomics framework, SNP-based genotypic data may be used to investigate questions of evolutionary, ecological, and conservation significance in natural populations of non-model organisms. Kokanee salmon are recently diverged freshwater populations of sockeye salmon (Oncorhynchus nerka) that exhibit reproductive ecotypes (stream-spawning and shore-spawning) in lakes throughout western North America and northeast Asia. Current conservation and management strategies may treat these ecotypes as discrete stocks; however, their recent divergence and low levels of gene flow make in-season genetic stock identification a challenge. The development of genome-wide SNP markers is an essential step towards fine-scale stock identification, and may enable a direct investigation of the genetic basis of ecotype divergence.
We used pooled cDNA samples from both ecotypes of kokanee to generate 750 million base pairs of transcriptome sequence data. These raw data were assembled into 11,074 high coverage contigs from which we identified 32,699 novel single nucleotide polymorphisms. A subset of these putative SNPs was validated using high-resolution melt analysis and Sanger resequencing to genotype independent samples of kokanee and anadromous sockeye salmon. We also identified a number of contigs that were composed entirely of reads from a single ecotype, which may indicate regions of differential gene expression between the two reproductive ecotypes. In addition, we found some evidence for greater pathogen load among the kokanee sampled in stream-spawning habitats, suggesting a possible evolutionary advantage to shore-spawning that warrants further study.
This study provides novel genomic resources to support population genetic and genomic studies of both kokanee and anadromous sockeye salmon, and has the potential to produce markers capable of fine-scale stock assessment. While this RNAseq approach succeeded in identifying a large number of new SNP loci, we found that allele frequencies in the pooled transcriptome data were not an accurate predictor of population allele frequencies.
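To make the last point concrete, the following is a minimal sketch of the comparison it implies: an allele frequency estimated from read counts at a SNP in a pooled sample, set against the frequency obtained by genotyping individuals. The function names and all numbers are hypothetical illustrations, not the study's pipeline or data.

```python
def pooled_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads covering a SNP in a pooled sample that carry the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def genotype_allele_frequency(genotypes: list[int]) -> float:
    """Alternate-allele frequency from diploid genotypes coded as 0, 1, or 2 alternate copies."""
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    return sum(genotypes) / (2 * len(genotypes))


# Hypothetical values for a single SNP (not taken from the study).
pooled = pooled_allele_frequency(alt_reads=30, total_reads=40)
individual = genotype_allele_frequency([0, 1, 1, 0, 2, 0, 1, 0])

# A large gap means the pooled read fraction is a poor proxy at this locus.
discrepancy = abs(pooled - individual)
print(f"pooled={pooled:.3f} genotyped={individual:.3f} gap={discrepancy:.3f}")
```

In pooled cDNA, read counts may reflect how strongly each individual and each allele is expressed as well as how common the allele is. That is one plausible reason such a gap could arise, though the abstract does not attribute the discrepancy to a specific cause.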