Baldwin-Brown James G, Long Anthony D, Thornton Kevin R
Department of Ecology and Evolutionary Biology, University of California, Irvine.
Mol Biol Evol. 2014 Apr;31(4):1040-55. doi: 10.1093/molbev/msu048. Epub 2014 Jan 18.
A novel approach for dissecting complex traits is to experimentally evolve laboratory populations under a controlled environment shift, resequence the resulting populations, and identify single nucleotide polymorphisms (SNPs) and/or genomic regions highly diverged in allele frequency. To better understand the power and localization ability of such an evolve and resequence (E&R) approach, we carried out forward-in-time population genetics simulations of 1 Mb genomic regions under a large combination of experimental conditions, then attempted to detect significantly diverged SNPs. Our analysis indicates that the ability to detect differentiation between populations is primarily affected by selection coefficient, population size, number of replicate populations, and number of founding haplotypes. We estimate that E&R studies can detect and localize causative sites with 80% success or greater when the number of founder haplotypes is over 500, experimental populations are replicated at least 25-fold, population size is at least 1,000 diploid individuals, and the selection coefficient on the locus of interest is at least 0.1. More achievable experimental designs (less replicated, fewer founder haplotypes, smaller effective population size, and smaller selection coefficients) can have power of greater than 50% to identify a handful of SNPs of which one is likely causative. Similarly, in cases where s ≥ 0.2, less demanding experimental designs can yield high power.
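As a rough illustration of how the factors named above (selection coefficient, population size, number of replicates) drive allele frequency divergence, the following is a minimal single-locus Wright-Fisher sketch. It is not the authors' simulator: it ignores linkage across the 1 Mb region, founder haplotype structure, and sequencing noise. It also assumes genic selection, in which each copy of the favored allele has relative fitness 1 + s. All function and parameter names are hypothetical.

```python
import random


def next_frequency(p, s, n_diploid, rng):
    """Advance one Wright-Fisher generation: selection, then drift."""
    # Deterministic change under the assumed genic selection model
    # (relative fitness 1 + s per copy of the favored allele).
    p_sel = p * (1.0 + s) / (1.0 + p * s)
    # Drift: draw 2N gene copies from the post-selection allele pool.
    copies = sum(rng.random() < p_sel for _ in range(2 * n_diploid))
    return copies / (2 * n_diploid)


def simulate_replicates(p0, s, n_diploid, generations, replicates, seed=0):
    """Return the final allele frequency in each replicate population."""
    rng = random.Random(seed)
    finals = []
    for _ in range(replicates):
        p = p0
        for _ in range(generations):
            p = next_frequency(p, s, n_diploid, rng)
        finals.append(p)
    return finals


def mean_divergence(selected, control):
    """Difference in mean final frequency between treatment and control."""
    return sum(selected) / len(selected) - sum(control) / len(control)


if __name__ == "__main__":
    # Illustrative values only; the replicate count here is far below
    # the 25-fold replication discussed in the abstract.
    selected = simulate_replicates(0.1, 0.1, 1000, 50, 5, seed=1)
    control = simulate_replicates(0.1, 0.0, 1000, 50, 5, seed=2)
    print("selected:", selected)
    print("control: ", control)
    print("mean divergence:", mean_divergence(selected, control))
```

Lowering `s`, `n_diploid`, or `replicates` in this sketch makes the treatment and control frequencies harder to tell apart, which is the qualitative pattern the abstract reports. The power figures quoted there come from the authors' full simulations, not from a model of this kind.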