Teshima Kosuke M, Coop Graham, Przeworski Molly
Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA.
Genome Res. 2006 Jun;16(6):702-12. doi: 10.1101/gr.5105206. Epub 2006 May 10.
The beneficial substitution of an allele shapes patterns of genetic variation at linked sites. Thus, in principle, adaptations can be mapped by looking for the signature of directional selection in polymorphism data. In practice, such efforts are hampered by the need for an accurate characterization of the demographic history of the species and of the effects of positive selection. In an attempt to circumvent these difficulties, researchers are increasingly taking a purely empirical approach, in which a large number of genomic regions are ordered by summaries of the polymorphism data, and loci with extreme values are considered to be likely targets of positive selection. We evaluated the reliability of the "empirical" approach, focusing on applications to human data and to maize. To do so, we considered a coalescent model of directional selection in a sensible demographic setting, allowing for selection on standing variation as well as on a new mutation. Our simulations suggest that while empirical approaches will identify several interesting candidates, they will also miss many--in some cases, most--loci of interest. The extent of the trade-off depends on the mode of positive selection and the demographic history of the population. Specifically, the false-discovery rate is higher when directional selection involves a recessive rather than a co-dominant allele, when it acts on a previously neutral rather than a new allele, and when the population has experienced a population bottleneck rather than maintained a constant size. One implication of these results is that, insofar as attributes of the beneficial mutation (e.g., the dominance coefficient) affect the power to detect targets of selection, genomic scans will yield an unrepresentative subset of loci that contribute to adaptations.
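As a rough illustration of the "empirical" approach described in the abstract, the sketch below ranks loci by a single summary statistic, flags the extreme tail as candidates, and reports the false-discovery rate and the fraction of truly selected loci that were missed. It is not the authors' method: the paper uses coalescent simulations under realistic demography, whereas this toy draws the statistic from two normal distributions. The function names, the 1% tail cutoff, the 2% fraction of selected loci, and the shift of -1.5 are all assumptions chosen only for illustration.

```python
import random


def empirical_scan(stats, is_selected, tail_fraction=0.01):
    """Rank loci by a summary statistic and flag the lowest tail as candidates.

    Returns (flagged indices, false-discovery rate, fraction of selected loci missed).
    """
    n = len(stats)
    n_flag = max(1, int(n * tail_fraction))
    # Order loci from most to least extreme (lowest statistic first).
    order = sorted(range(n), key=lambda i: stats[i])
    flagged = order[:n_flag]
    true_pos = sum(1 for i in flagged if is_selected[i])
    n_selected = sum(is_selected)
    fdr = 1 - true_pos / n_flag
    missed = 1 - true_pos / n_selected if n_selected else 0.0
    return flagged, fdr, missed


def toy_genome(n_loci=10000, frac_selected=0.02, shift=-1.5, seed=1):
    """Toy stand-in for simulated data (assumption: normal statistics).

    Neutral loci have mean 0; selected loci have their mean shifted by `shift`.
    """
    rng = random.Random(seed)
    stats, labels = [], []
    for _ in range(n_loci):
        selected = rng.random() < frac_selected
        stats.append(rng.gauss(shift if selected else 0.0, 1.0))
        labels.append(selected)
    return stats, labels


if __name__ == "__main__":
    stats, labels = toy_genome()
    flagged, fdr, missed = empirical_scan(stats, labels)
    print(f"flagged={len(flagged)} FDR={fdr:.2f} missed={missed:.2f}")
```

Shrinking the magnitude of `shift` mimics a weaker signature of selection (for example, the recessive or standing-variation cases named in the abstract), and in this toy setting would be expected to raise both the false-discovery rate and the fraction of selected loci missed.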