West Virginia University, Animal and Nutritional Sciences, Morgantown, WV, 26506, USA.
BMC Genomics. 2009 Nov 25;10:559. doi: 10.1186/1471-2164-10-559.
To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the semi-tetraploid salmonid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false-positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming method of SNP identification in rainbow trout, and it is therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population.
The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads, providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, was genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts.
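The headline figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' analysis: it assumes coverage is approximated as total reads divided by fragment ends, which the abstract does not state explicitly, and the function and variable names are illustrative only.

```python
# Figures quoted in the abstract.
READS = 2_000_000
UNIQUE_FRAGMENTS = 150_000
FRAGMENT_ENDS = 2 * UNIQUE_FRAGMENTS  # two ends per restriction fragment
GENOTYPED = 384   # putative SNPs selected for genotyping
VALIDATED = 183   # putative SNPs that validated
MAPPED = 167      # validated markers placed on the linkage map


def fold_coverage(reads: int, targets: int) -> float:
    """Naive average coverage: reads per target (assumed definition)."""
    return reads / targets


def percent(part: int, whole: int) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


coverage = fold_coverage(READS, FRAGMENT_ENDS)
validation_rate = percent(VALIDATED, GENOTYPED)
mapped_rate = percent(MAPPED, VALIDATED)

print(f"coverage of fragment ends: {coverage:.1f}-fold")
print(f"validation rate: {validation_rate:.1f}%")
print(f"validated markers placed on the map: {mapped_rate:.1f}%")
```

Under these assumptions the naive ratio is slightly above the reported 6-fold average, and 183 of 384 rounds to the reported 48%. The share of validated markers placed on the map is derived here from the two reported counts and is not a figure given in the abstract.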
The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable.