Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Stockholm, Sweden.
Department of Plant Biology, Michigan State University, Lansing, MI, USA.
Mol Biol Evol. 2021 Dec 9;38(12):5563-5575. doi: 10.1093/molbev/msab270.
Accurate estimates of genome-wide rates and fitness effects of new mutations are essential for an improved understanding of molecular evolutionary processes. Although eukaryotic genomes generally contain a large noncoding fraction, functional noncoding regions and fitness effects of mutations in such regions are still incompletely characterized. A promising approach to characterize functional noncoding regions relies on identifying accessible chromatin regions (ACRs) tightly associated with regulatory DNA. Here, we applied this approach to identify and estimate selection on ACRs in Capsella grandiflora, a crucifer species ideal for population genomic quantification of selection due to its favorable population demography. We describe a population-wide ACR distribution based on ATAC-seq data for leaf samples of 16 individuals from a natural population. We use population genomic methods to estimate fitness effects and proportions of positively selected fixations (α) in ACRs and find that intergenic ACRs harbor a considerable fraction of weakly deleterious new mutations, as well as a significantly higher proportion of strongly deleterious mutations than comparable inaccessible intergenic regions. ACRs are enriched for expression quantitative trait loci (eQTL) and depleted of transposable element insertions, as expected if intergenic ACRs are under selection because they harbor regulatory regions. By integrating empirical identification of intergenic ACRs with analyses of eQTL and population genomic analyses of selection, we demonstrate that intergenic regulatory regions are an important source of nearly neutral mutations. These results improve our understanding of selection on noncoding regions and the role of nearly neutral mutations for evolutionary processes in outcrossing Brassicaceae species.
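For readers unfamiliar with α, the quantity can be illustrated with the simple McDonald-Kreitman-style point estimator, α = 1 − (D_neutral · P_selected) / (D_selected · P_neutral), where D counts fixed differences and P counts polymorphisms. This is only a minimal sketch, not the method used in the study: the abstract does not specify the estimator, and approaches that model the distribution of fitness effects additionally correct for weakly deleterious polymorphisms. The function name and the counts below are hypothetical.

```python
def mk_alpha(d_sel: float, d_neu: float, p_sel: float, p_neu: float) -> float:
    """Simple McDonald-Kreitman-style estimate of alpha.

    d_sel, d_neu: fixed differences (divergence) at the putatively selected
                  class (e.g. intergenic ACRs) and at a neutral reference class.
    p_sel, p_neu: polymorphism counts at the same two site classes.

    Assumption: no correction for weakly deleterious polymorphisms, which
    tends to bias this estimate downward.
    """
    if d_sel <= 0 or p_neu <= 0:
        raise ValueError("d_sel and p_neu must be positive")
    return 1.0 - (d_neu * p_sel) / (d_sel * p_neu)


# Hypothetical counts, chosen only to illustrate the arithmetic.
alpha = mk_alpha(d_sel=120, d_neu=300, p_sel=40, p_neu=200)
print(f"alpha = {alpha:.2f}")
```

With these made-up counts, the ratio of selected to neutral polymorphism (40/200) is half the ratio of selected to neutral divergence (120/300), so the estimate attributes half of the fixations in the selected class to positive selection.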