Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro 76230, México.
Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Genetics. 2022 Apr 4;220(4). doi: 10.1093/genetics/iyac002.
Recent genome sequencing studies with large sample sizes in humans have discovered a vast number of low-frequency variants, providing an important source of information for analyzing how selection acts on human genetic variation. To estimate the strength of natural selection acting on low-frequency variants, we have developed a likelihood-based method that uses the lengths of pairwise identity-by-state between haplotypes carrying low-frequency variants. We show that in some nonequilibrium populations (such as those that have undergone recent population expansions) it is possible to distinguish between positive and negative selection acting on a set of variants. With our new framework, one can infer a fixed selection intensity acting on a set of variants at a particular frequency, or a distribution of selection coefficients for standing variants and new mutations. We show an application of our method to the UK10K phased haplotype dataset of individuals.
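To make the input statistic concrete, the following is a minimal sketch of how pairwise identity-by-state lengths might be tabulated among haplotypes that carry a focal low-frequency allele. It is not the authors' implementation and it does not include the likelihood step. The function names, the 0/1 haplotype encoding, and the definition of the length as the base-pair span of the contiguous run of matching sites around the focal site are all assumptions made for illustration.

```python
from itertools import combinations


def ibs_length(hap_a, hap_b, positions, focal_index):
    """Span (bp) of the run of matching sites that contains the focal site.

    Assumption: the tract is measured between the outermost matching
    sites on each side, which is a simplification for illustration.
    """
    # Extend to the left while the two haplotypes carry the same allele.
    left = focal_index
    while left > 0 and hap_a[left - 1] == hap_b[left - 1]:
        left -= 1
    # Extend to the right in the same way.
    right = focal_index
    last = len(positions) - 1
    while right < last and hap_a[right + 1] == hap_b[right + 1]:
        right += 1
    return positions[right] - positions[left]


def pairwise_ibs_lengths(haplotypes, positions, focal_index):
    """IBS lengths for every pair of haplotypes carrying the focal allele (coded 1)."""
    carriers = [h for h in haplotypes if h[focal_index] == 1]
    return [
        ibs_length(a, b, positions, focal_index)
        for a, b in combinations(carriers, 2)
    ]


# Toy data (hypothetical): four phased haplotypes over five variant sites.
positions = [100, 250, 400, 900, 1500]
haplotypes = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],  # does not carry the focal allele, so it is skipped
]
lengths = pairwise_ibs_lengths(haplotypes, positions, focal_index=2)
print(lengths)
```

In this toy case three haplotypes carry the focal allele, so three pairwise lengths are returned. A likelihood-based inference such as the one described above would then compare the observed distribution of these lengths with expectations under different selection intensities and demographic models.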