Gorlov Ivan P, Gorlova Olga Y, Amos Christopher I
The Geisel School of Medicine, Dartmouth College, Dartmouth-Hitchcock Medical Center, Lebanon, New Hampshire, United States of America.
PLoS Genet. 2015 Jul 22;11(7):e1005371. doi: 10.1371/journal.pgen.1005371. eCollection 2015 Jul.
Genome-wide association studies (GWAS) have generated sufficient data to assess the role of selection in shaping the allelic diversity of disease-associated SNPs. Negative selection against disease risk variants is expected to reduce their frequencies, making them overrepresented among minor (<50%) alleles. Indeed, we found that the overall proportion of risk alleles was higher among minor alleles (frequency <50%) than among major alleles. We hypothesized that negative selection may have different effects on environment (or lifestyle)-dependent versus environment (or lifestyle)-independent diseases. We used an environment/lifestyle index (ELI) to assess the influence of environmental/lifestyle factors on disease etiology. ELI was defined as the number of publications mentioning "environment" or "lifestyle" AND the disease per 1,000 disease-mentioning publications. We found that the frequency distributions of risk alleles for diseases with strong environmental/lifestyle components follow the distribution expected under a selectively neutral model, while the frequency distributions of risk alleles for diseases with weak environmental/lifestyle influences are shifted toward lower values, indicating effects of negative selection. We hypothesized that previously selectively neutral variants become risk alleles when the environment changes. The hypothesis of ancestrally neutral, currently disadvantageous risk-associated alleles predicts that the distribution of risk alleles for environment/lifestyle-dependent diseases will follow a neutral model, since natural selection has not had enough time to influence allele frequencies. The results of our analysis suggest that prediction of SNP functionality based on the level of evolutionary conservation may not be useful for SNPs associated with environment/lifestyle-dependent diseases.
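The two quantities the abstract relies on, the ELI and the share of risk alleles among minor versus major alleles, can be illustrated with a minimal Python sketch. The function names, the handling of alleles at exactly 50% frequency, and all input values here are assumptions made for illustration; they are not the authors' code or data.

```python
def environment_lifestyle_index(n_env_and_disease, n_disease):
    """ELI as defined in the abstract: publications mentioning
    ("environment" OR "lifestyle") AND the disease, per 1,000
    disease-mentioning publications. The two counts are assumed to
    come from a literature search done elsewhere."""
    if n_disease <= 0:
        raise ValueError("n_disease must be positive")
    return 1000.0 * n_env_and_disease / n_disease


def risk_allele_proportions(risk_allele_freqs):
    """For biallelic SNPs, return (proportion of minor alleles that are
    risk alleles, proportion of major alleles that are risk alleles).
    Each SNP has one minor and one major allele, so both proportions
    share the same denominator: the number of SNPs considered.
    Assumption: SNPs with a risk allele frequency of exactly 0.5 are
    skipped, because neither allele is minor."""
    usable = [f for f in risk_allele_freqs if f != 0.5]
    if not usable:
        raise ValueError("no SNPs with a defined minor allele")
    n_minor_risk = sum(1 for f in usable if f < 0.5)
    minor_share = n_minor_risk / len(usable)
    return minor_share, 1.0 - minor_share


# Hypothetical inputs, for illustration only.
eli = environment_lifestyle_index(n_env_and_disease=50, n_disease=2000)
minor_share, major_share = risk_allele_proportions([0.1, 0.2, 0.7, 0.4])
```

Under negative selection, the first returned proportion would be expected to exceed the second; under a neutral model the two would be expected to be similar.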