Di Chenlu, Ramesh Swetha, Ernst Jason, Lohmueller Kirk E
Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA.
Department of Computer Science, Samueli School of Engineering, University of California, Los Angeles, CA, USA.
bioRxiv. 2025 May 14:2025.05.14.654124. doi: 10.1101/2025.05.14.654124.
While annotations of noncoding regions in the human genome are increasing, the fitness effects of mutations in these regions remain unclear. Here, we leverage these functional genomic annotations and human polymorphism data to infer the distributions of fitness effects of new noncoding mutations in humans. Our novel approach controls for mutation rate variation and linked selection along the genome. We find distinct patterns of selection in putative enhancers, promoters, and conserved noncoding regions. While mutations in enhancers are often neutral, approximately 30% of mutations in promoters are deleterious. The most conserved noncoding regions, which show reduced divergence across mammals and primates, have the highest proportion of deleterious mutations. Notably, although we infer that the most conserved sites across mammals and primates are enriched for deleterious mutations, these sites account for only a minority of the deleterious mutations in noncoding regions. For example, the top 5% of conserved noncoding sites encompass fewer than 20% of deleterious mutations, indicating that functional noncoding regions vary widely in the distribution of their evolutionary constraint. Our findings highlight the dynamic evolution of gene regulation and shifting selection pressures over deep evolutionary timescales. Consistent with this finding, we infer that mutations in ~7-9% of the noncoding genome are deleterious. These insights have broad implications for using comparative genomics to identify non-neutrally evolving sequences in the human genome.