Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA and Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel.
Nucleic Acids Res. 2014 Jan;42(1):430-41. doi: 10.1093/nar/gkt862. Epub 2013 Sep 27.
Protein-DNA recognition is a critical component of gene regulatory processes, but the underlying molecular mechanisms are not yet completely understood. Whereas the DNA binding preferences of transcription factors (TFs) are commonly described using nucleotide sequences, the 3D DNA structure is recognized by proteins and is crucial for achieving binding specificity. However, only recently has the ability to analyze DNA shape in a high-throughput manner made it feasible to integrate structural information into studies of protein-DNA binding. Here, we focused on the homeodomain family of TFs and analyzed the DNA shape of thousands of their DNA binding sites, investigating the covariation between the protein sequence and the sequence and shape of their DNA targets. We found distinct homeodomain regions that were more correlated with either the nucleotide sequence or the DNA shape of their preferred binding sites, demonstrating different readout mechanisms through which homeodomains attain DNA binding specificity. We identified specific homeodomain residues that likely play key roles in DNA recognition via shape readout. Finally, we showed that adding DNA shape information when characterizing binding sites improved the prediction accuracy of homeodomain binding specificities. Taken together, our findings indicate that DNA shape information can generally provide new mechanistic insights into TF binding.