Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA.
Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA.
Mol Biol Evol. 2018 Aug 1;35(8):1958-1967. doi: 10.1093/molbev/msy099.
Noncoding DNA sequences, which play various roles in gene expression and regulation, are under evolutionary pressure. Gene regulation requires specific protein-DNA binding events, and our previous studies showed that both DNA sequence and shape readout are employed by transcription factors (TFs) to achieve DNA binding specificity. By investigating the shape-disrupting properties of single nucleotide polymorphisms (SNPs) in human regulatory regions, we established a link between disruptive local DNA shape changes and loss of specific TF binding. Furthermore, we described cases where disease-associated SNPs may alter TF binding through DNA shape changes. This link led us to hypothesize that local DNA shape within and around TF binding sites is under selection pressure. To verify this hypothesis, we analyzed SNP data derived from 216 natural strains of Drosophila melanogaster. Comparing SNPs located in functional and nonfunctional regions within experimentally validated cis-regulatory modules (CRMs) from D. melanogaster that are active in the blastoderm stage of development, we found that SNPs within functional regions tended to cause smaller DNA shape variations. Furthermore, SNPs with higher minor allele frequency were more likely to result in smaller DNA shape variations. The same analysis based on a large number of SNPs in putative CRMs of the D. melanogaster genome derived from DNase I accessibility data confirmed these observations. Taken together, our results indicate that common SNPs in functional regions tend to maintain DNA shape, whereas shape-disrupting SNPs are more likely to be eliminated through purifying selection.
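The two quantities compared in the abstract, minor allele frequency and the size of a SNP-induced DNA shape change, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the use of Euclidean distance as the shape-change measure, and all numeric values are assumptions made for illustration, and the shape profiles are taken to be precomputed feature vectors (for example, minor groove width around the SNP) for the reference and alternate alleles.

```python
import math


def minor_allele_frequency(allele_counts):
    """Return the frequency of the less common allele at a biallelic SNP.

    allele_counts maps each allele to the number of strains carrying it
    (hypothetical input format).
    """
    if len(allele_counts) != 2:
        raise ValueError("expected exactly two alleles")
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no allele observations")
    return min(allele_counts.values()) / total


def shape_change(ref_profile, alt_profile):
    """Euclidean distance between two equal-length shape feature vectors.

    Assumption: distance between profiles is used here as a simple proxy
    for how strongly a SNP disrupts local DNA shape.
    """
    if len(ref_profile) != len(alt_profile):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((r - a) ** 2 for r, a in zip(ref_profile, alt_profile)))


# Toy values only, not data from the study.
maf = minor_allele_frequency({"A": 162, "G": 54})
delta = shape_change([5.0, 4.0, 3.0], [5.0, 1.0, 7.0])
```

Under these assumptions, the abstract's finding corresponds to SNPs with larger `maf` tending to have smaller `delta`, and to SNPs in functional regions having smaller `delta` than those in nonfunctional regions.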