Estación Experimental de Aula Dei, Consejo Superior de Investigaciones Científicas, Av. Montañana 1.005, Zaragoza, Spain.
Proteins. 2010 Jan;78(1):52-62. doi: 10.1002/prot.22525.
Specific protein-DNA interactions are central to a wide range of cellular processes and have been studied both experimentally and computationally over the years. Despite the growing collection of protein-DNA complexes, only a few studies so far have aimed at dissecting the structural characteristics of DNA binding among evolutionarily related proteins. Some questions that remain to be answered are: (a) what the different readout mechanisms contribute in members of a given structural superfamily, (b) how similar the interfaces of superfamily members are and how this affects binding specificity, (c) how DNA-binding protein superfamilies are distributed across taxa, and (d) whether there is a general or family-specific code for the recognition of DNA. We have recently developed a straightforward method to dissect the interface of protein-DNA complexes at the atomic level, and here we apply it to study 175 proteins belonging to nine representative superfamilies. Our results indicate that evolutionarily unrelated DNA-binding domains broadly conserve specificity statistics, such as the ratio of indirect/direct readout and the frequency of atomic interactions, thereby supporting the existence of a set of recognition rules. We also find that interface conservation follows superfamily-specific trends. Finally, this article identifies tendencies in the phylogenetic distribution of transcription factors, which might be related to the evolution of regulatory networks, and postulates that the modular nature of zinc finger proteins can explain their role in large genomes, as it allows for larger binding interfaces in a single protein molecule.