Marabotti Anna, Spyrakis Francesca, Facchiano Angelo, Cozzini Pietro, Alberti Saverio, Kellogg Glen E, Mozzarelli Andrea
Laboratory for Bioinformatics and Computational Biology, Institute of Food Science, National Research Council, Avellino, Italy.
J Comput Chem. 2008 Sep;29(12):1955-69. doi: 10.1002/jcc.20954.
Despite decades of investigations, it is not yet clear whether there are rules dictating the specificity of the interaction between amino acids and nucleotide bases. This issue was addressed by determining, in a dataset consisting of 100 high-resolution protein-DNA structures, the frequency and energy of interaction between each amino acid and base, and the energetics of water-mediated interactions. The analysis was carried out using HINT, a non-Newtonian force field encoding both enthalpic and entropic contributions, and Rank, a geometry-based tool for evaluating hydrogen bond interactions. A frequency- and energy-based preferential interaction of Arg and Lys with G, Asp and Glu with C, and Asn and Gln with A was found. Not only favorable, but also unfavorable contacts were found to be conserved. Water-mediated interactions strongly increase the probability of Thr-A, Lys-A, and Lys-C contacts. The frequency, interaction energy, and water enhancement factors associated with each amino acid-base pair were used to predict the base triplet recognized by the helix motif in 45 zinc fingers, which represents an ideal case study for the analysis of one-to-one amino acid-base pair contacts. The model correctly predicted 70.4% of 135 amino acid-base pairs, and, by weighting the energetic relevance of each amino acid-base pair to the overall recognition energy, it yielded a prediction rate of 89.7%.
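The prediction step described above can be illustrated with a minimal sketch. The abstract does not state how frequency, interaction energy, and water enhancement factor are combined, so the product used in `pair_score` is an assumption. All names (`pair_score`, `predict_triplet`, `TOY_TABLE`) and all numeric values are hypothetical placeholders, not data from the study. The two rate functions mirror the two reported figures: a plain hit rate (70.4% of 135 pairs corresponds to 95 correct pairs) and a rate in which each pair is weighted by its assumed energetic relevance.

```python
from typing import Dict, Sequence, Tuple

BASES = ("A", "C", "G", "T")

# (frequency, energy, water_factor) per (residue, base) pair.
# Placeholder values for illustration only; NOT taken from the paper.
# Assumption: a higher energy value means a more favorable contact.
TOY_TABLE: Dict[Tuple[str, str], Tuple[float, float, float]] = {
    ("ARG", "G"): (0.6, 2.0, 1.0),
    ("ARG", "A"): (0.2, 0.5, 1.0),
    ("ASP", "C"): (0.5, 1.5, 1.0),
    ("ASP", "G"): (0.1, -1.0, 1.0),
    ("ASN", "A"): (0.5, 1.2, 1.0),
    ("ASN", "T"): (0.2, 0.4, 1.0),
}


def pair_score(freq: float, energy: float, water: float) -> float:
    """Assumed combination rule: simple product of the three factors."""
    return freq * energy * water


def predict_base(residue: str, table=TOY_TABLE) -> str:
    """Return the base with the highest score for one residue."""
    # Pairs missing from the table default to a neutral score of 0.
    return max(
        BASES,
        key=lambda b: pair_score(*table.get((residue, b), (0.0, 0.0, 1.0))),
    )


def predict_triplet(residues: Sequence[str], table=TOY_TABLE) -> str:
    """Predict one base per recognition residue (one-to-one contacts)."""
    return "".join(predict_base(r, table) for r in residues)


def unweighted_rate(predicted: str, observed: str) -> float:
    """Fraction of positions where the predicted base matches."""
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)


def weighted_rate(predicted: str, observed: str, weights: Sequence[float]) -> float:
    """Hit rate where each position counts in proportion to its weight."""
    matched = sum(w for p, o, w in zip(predicted, observed, weights) if p == o)
    return matched / sum(weights)


triplet = predict_triplet(["ARG", "ASP", "ASN"])
plain = unweighted_rate(triplet, "GCT")
weighted = weighted_rate(triplet, "GCT", [2.0, 1.0, 1.0])
```

With these placeholder values the weighted rate exceeds the plain rate because the single mismatch falls on a low-weight position, which is the same qualitative effect as the reported rise from 70.4% to 89.7%.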