Kevo Subarctic Research Institute, University of Turku, Turku 20014, Finland.
BMC Genomics. 2013 Jan 16;14:12. doi: 10.1186/1471-2164-14-12.
New sequencing technologies have tremendously increased the number of known molecular markers (single nucleotide polymorphisms; SNPs) in a variety of species. Concurrently, improvements in genotyping technology have made it possible to efficiently genotype large numbers of SNPs distributed across the genome, enabling genome-wide association studies (GWAS). However, genotyping large numbers of individuals at large numbers of SNPs remains prohibitively expensive for many research groups. A possible solution to this problem is to determine allele frequencies from pooled DNA samples. Such 'allelotyping' has been presented as a cost-effective alternative to individual genotyping and has become popular in human GWAS. In this article we tested the effectiveness of DNA pooling for obtaining accurate allele frequency estimates for Atlantic salmon (Salmo salar L.) populations using an Illumina SNP-chip.
In total, 56 Atlantic salmon DNA pools from 14 populations were analyzed on an Atlantic salmon SNP-chip containing probes for 5568 SNP markers, 3928 of which were bi-allelic. We developed an efficient quality control filter which enables exclusion of loci showing a high error rate and a minor allele frequency (MAF) close to zero. After applying multiple quality control filters we obtained allele frequency estimates for 3631 bi-allelic loci. We observed high concordance (r > 0.99) between allele frequency estimates derived from individual genotyping and those derived from DNA pools. Our results also indicate that even relatively small DNA pools (35 individuals) can provide accurate allele frequency estimates for a given sample.
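As an illustration of the comparison described above, the following is a minimal Python sketch of how a concordance value such as r could be computed between allele frequencies from individual genotypes and estimates from DNA pools, together with a simple MAF filter. It is not the authors' pipeline: the function names, the genotype coding (0, 1 or 2 copies of one allele, `None` for a missing call) and the `min_maf` threshold of 0.05 are all assumptions made for this example.

```python
from math import sqrt


def allele_freq_from_genotypes(genotypes):
    """Frequency of one allele from individual genotypes.

    Assumed coding: each genotype is the number of copies of the allele
    (0, 1 or 2); None marks a missing call and is ignored.
    """
    called = [g for g in genotypes if g is not None]
    return sum(called) / (2 * len(called))


def passes_maf_filter(freq, min_maf=0.05):
    """Keep a locus only if its minor allele frequency is at least min_maf.

    The 0.05 default is a hypothetical threshold, not a value from the study.
    """
    return min(freq, 1.0 - freq) >= min_maf


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def concordance(individual_freqs, pooled_freqs, min_maf=0.05):
    """Correlate per-locus frequencies after dropping near-monomorphic loci."""
    kept = [
        (ind, pool)
        for ind, pool in zip(individual_freqs, pooled_freqs)
        if passes_maf_filter(ind, min_maf)
    ]
    return pearson_r([k[0] for k in kept], [k[1] for k in kept])
```

Given per-locus frequencies from both approaches for the same sample, `concordance(individual_freqs, pooled_freqs)` returns a single correlation coefficient. Filtering on the individual-genotype frequency is one possible design choice; a real analysis might filter on the pooled estimate or on both.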
Although array replicates were associated with a higher level of variation than pool construction, we suggest that both sources of variation should be taken into account. This study demonstrates that DNA pooling allows fast, high-throughput determination of allele frequencies in Atlantic salmon, enabling cost-efficient identification of informative markers for discriminating populations at various geographical scales, as well as identification of loci controlling ecologically and economically important traits.