Schraiber Joshua G, Evans Steven N, Slatkin Montgomery
Department of Genome Sciences, University of Washington, Seattle, Washington 98195
Department of Statistics, University of California, Berkeley, California; Department of Mathematics, University of California, Berkeley, California.
Genetics. 2016 May;203(1):493-511. doi: 10.1534/genetics.116.187278. Epub 2016 Mar 23.
The advent of accessible ancient DNA technology now allows the direct ascertainment of allele frequencies in ancestral populations, thereby enabling the use of allele frequency time series to detect and estimate natural selection. Such direct observations of allele frequency dynamics are expected to be more powerful than inferences made using patterns of linked neutral variation obtained from modern individuals. We developed a Bayesian method to make use of allele frequency time series data and infer the parameters of general diploid selection, along with allele age, in nonequilibrium populations. We introduce a novel path augmentation approach, in which we use Markov chain Monte Carlo to integrate over the space of allele frequency trajectories consistent with the observed data. Using simulations, we show that this approach has good power to estimate selection coefficients and allele age. Moreover, when applying our approach to data on horse coat color, we find that ignoring a relevant demographic history can significantly bias the results of inference. Our approach is made available in a C++ software package.
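To make concrete the kind of data the abstract describes, the following minimal sketch simulates an allele frequency time series under general diploid selection in a population of changing size. It is not the authors' path augmentation method or their C++ package. It only illustrates a forward model of the sort such inference targets, assuming a discrete-generation Wright-Fisher model, relative fitnesses of 1, 1 + s1, and 1 + s2 for the three genotypes, and binomial sampling of chromosomes at the sampled time points. All function names, parameter values, and the bottleneck scenario are hypothetical.

```python
import random


def expected_frequency_after_selection(x, s1, s2):
    """Deterministic one-generation change under general diploid selection.

    Assumed relative fitnesses (an illustrative parametrization):
    aa = 1, Aa = 1 + s1, AA = 1 + s2, with x the frequency of allele A.
    """
    mean_fitness = 1.0 + s2 * x * x + 2.0 * s1 * x * (1.0 - x)
    return x * (1.0 + s2 * x + s1 * (1.0 - x)) / mean_fitness


def simulate_trajectory(x0, s1, s2, pop_sizes, rng):
    """Simulate a Wright-Fisher trajectory starting from frequency x0.

    pop_sizes[t] is the diploid size of the generation produced at step t,
    so a nonconstant list encodes a nonequilibrium demography.
    """
    traj = [x0]
    x = x0
    for n_dip in pop_sizes:
        p = expected_frequency_after_selection(x, s1, s2)
        # Binomial draw of 2N gametes, written with the standard library only.
        copies = sum(rng.random() < p for _ in range(2 * n_dip))
        x = copies / (2 * n_dip)
        traj.append(x)
    return traj


def sample_time_series(traj, sample_times, sample_size, rng):
    """Draw sample_size chromosomes at each sampled generation.

    Returns (generation, derived allele count, sample size) tuples.
    """
    return [
        (t, sum(rng.random() < traj[t] for _ in range(sample_size)), sample_size)
        for t in sample_times
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    pop_sizes = [500] * 50 + [100] * 50  # hypothetical bottleneck
    traj = simulate_trajectory(0.1, 0.01, 0.02, pop_sizes, rng)
    for t, count, n in sample_time_series(traj, [0, 25, 50, 75, 100], 20, rng):
        print(t, count, n)
```

Only the sampled counts would be observed in practice. The full trajectory between sampling times is hidden, which is the quantity the abstract says is integrated over by Markov chain Monte Carlo.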