Guo Yan, Cai Qiuyin, Li Chun, Li Jiang, Courtney Regina, Zheng Wei, Long Jirong
Department of Cancer Biology, Vanderbilt University, Nashville TN 37232, USA.
Int J Comput Biol Drug Des. 2013;6(4):279-93. doi: 10.1504/IJCBDD.2013.056709. Epub 2013 Sep 30.
Next-generation sequencing (NGS) technology has matured and, given its current affordability, will replace the SNP chip as the genotyping tool of choice. Even so, large-scale studies will still require careful design to contain cost. In this study, we designed an experiment to assess the accuracy of allele frequencies estimated from pooled sequencing data. We compared allele frequencies estimated from the pooled sequencing data with those estimated from individual SNP chip data and observed high correlations between them. However, when we calculated the error rate, we found that for many SNPs the allele frequency estimated from sequencing data differed significantly from the allele frequency estimated from SNP chip data. We conclude that correlation is not an ideal measure for comparing allele frequencies and, for the purpose of estimating allele frequency, we do not recommend pooling with NGS as a cheaper alternative to genotyping each sample individually.
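The contrast between a high correlation and a high per-SNP error rate can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the study: the toy frequencies and read counts are invented, the function names are hypothetical, and "error rate" is taken here to mean the fraction of SNPs whose pooled estimate fails a normal-approximation binomial test against the chip frequency, which is treated as a fixed reference value. The abstract does not say how the error rate was actually computed.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def differs(alt_reads, depth, chip_af, z_crit=1.96):
    """Assumed test: does the pooled read fraction differ from the chip
    frequency under a normal approximation to the binomial?"""
    se = math.sqrt(chip_af * (1.0 - chip_af) / depth)
    if se == 0.0:
        # Chip frequency is exactly 0 or 1: any discordant read is a difference.
        return alt_reads / depth != chip_af
    z = (alt_reads / depth - chip_af) / se
    return abs(z) > z_crit

# Hypothetical toy data: five SNPs, each covered by 1000 pooled reads.
chip_afs = [0.10, 0.30, 0.50, 0.70, 0.90]   # from individual genotypes
alt_reads = [150, 350, 550, 650, 850]        # alternate-allele reads in the pool
depth = 1000

pooled_afs = [a / depth for a in alt_reads]

# Correlation summarises the overall trend across SNPs.
r = pearson(chip_afs, pooled_afs)

# The error rate counts SNPs whose pooled estimate is individually off.
flags = [differs(a, depth, p) for a, p in zip(alt_reads, chip_afs)]
error_rate = sum(flags) / len(flags)

print(f"Pearson r = {r:.4f}")
print(f"error rate = {error_rate:.2f}")
```

In this toy data every pooled estimate is off by 0.05, yet the correlation stays above 0.99 because the estimates still track the chip frequencies across SNPs. Correlation summarises the overall trend, while a per-SNP test exposes disagreement at individual sites.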