Temiz N Alpay, Camacho Carlos J
Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Nucleic Acids Res. 2009 Jul;37(12):4076-88. doi: 10.1093/nar/gkp289. Epub 2009 May 8.
A major obstacle to understanding the molecular basis of transcriptional regulation is the lack of a recognition code for protein-DNA interactions. Using high-quality crystal structures and binding data on the promiscuous family of C2H2 zinc fingers (ZFs), we decode 10 fundamental specific interactions responsible for protein-DNA recognition. The interactions include five hydrogen bond types, three atomic desolvation penalties, a favorable non-polar energy, and a novel water accessibility factor. We apply this code to three large datasets containing a total of 89 C2H2 transcription factor (TF) mutants on the three ZFs of EGR. Guided by molecular dynamics simulations of individual ZFs, we map the interactions onto homology models that embody all feasible intra- and intermolecular bonds, selecting for each sequence the structure with the lowest free energy. These interactions reproduce the change in affinity of 35 mutants of finger I (R² = 0.998), 23 mutants of finger II (R² = 0.96) and 31 finger III human domains (R² = 0.94). Our findings reveal recognition rules that depend on DNA sequence/structure, molecular water at the interface and induced fit of the C2H2 TFs. Collectively, our method provides the first robust framework to decode the molecular basis of TF binding to DNA.
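The scoring scheme outlined in the abstract (a sum over ten interaction types, selection of the lowest-energy model for each sequence, and comparison with measured affinities) might be sketched roughly as follows. This is only an illustration under stated assumptions: the term names, the linear weighted-sum form, the weights passed in, and the use of a squared Pearson correlation for R² are all hypothetical and are not taken from the paper.

```python
# Hypothetical labels for the ten interaction types named in the abstract:
# five hydrogen bond types, three desolvation penalties, one non-polar
# energy, and one water accessibility factor.
TERMS = (
    "hbond_1", "hbond_2", "hbond_3", "hbond_4", "hbond_5",
    "desolv_1", "desolv_2", "desolv_3",
    "nonpolar",
    "water_access",
)


def model_energy(counts, weights):
    """Score one structural model as a weighted sum of its interaction counts.

    Assumption: a linear form; terms absent from `counts` contribute zero.
    """
    return sum(weights.get(t, 0.0) * counts.get(t, 0.0) for t in TERMS)


def lowest_energy_model(models, weights):
    """Return the candidate model (a dict of counts) with the lowest energy."""
    return min(models, key=lambda counts: model_energy(counts, weights))


def r_squared(predicted, measured):
    """Squared Pearson correlation between predicted and measured values.

    Assumption: the paper's exact definition of R² may differ.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov * cov / (var_p * var_m)
```

In this sketch, each mutant sequence would have several candidate homology models, each summarized as a dictionary of interaction counts. The minimum-energy model supplies the predicted affinity change, which is then compared against the experimental values.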