Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India.
Department of Biological Sciences, Purdue University, West Lafayette, IN, United States; Department of Computer Science, Purdue University, West Lafayette, IN, United States.
Methods. 2023 May;213:10-17. doi: 10.1016/j.ymeth.2023.03.002. Epub 2023 Mar 15.
Protein-DNA interactions play an important role in biological processes such as gene expression, replication, and transcription. Identifying the features that dictate the binding affinity of protein-DNA complexes, and predicting those affinities, is important for elucidating their recognition mechanisms. In this work, we collected experimental binding free energies (ΔG) for a set of 391 protein-DNA complexes and derived several structure-based features, including interaction energy, contact potentials, volume and surface area of binding site residues, base step parameters of the DNA, and contacts between different types of atoms. Our analysis of the relationship between binding affinity and structural features revealed that the important factors depend mainly on the number of DNA strands as well as on the functional and structural classes of the proteins. Specifically, binding site properties (such as the number of atom contacts between the DNA and protein, and the volume of protein binding sites) and interaction-based features (such as interaction energies and contact potentials) are important for understanding the binding affinity. Further, we developed multiple regression equations for predicting the binding affinity of protein-DNA complexes belonging to different structural and functional classes. On a jack-knife test, our method showed an average correlation of 0.78 and a mean absolute error of 0.98 kcal/mol between the experimental and predicted binding affinities. We have developed a web server, PDA-PreD (Protein-DNA Binding affinity predictor), for predicting the affinity of protein-DNA complexes; it is freely available at https://web.iitm.ac.in/bioinfo2/pdapred/.
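The evaluation protocol described above (a multiple regression model scored by a jack-knife test, reporting correlation and mean absolute error) can be illustrated with a minimal sketch. It assumes that "jack-knife" means leave-one-out validation, as the term is commonly used in this field, and it fits ordinary least squares through the normal equations. The feature values and ΔG values below are synthetic placeholders chosen for illustration; they are not data, features, or coefficients from the study, and all function names are hypothetical.

```python
from math import sqrt


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, targets):
    """Return [intercept, w1, w2, ...] minimising squared error."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, feature_row):
    """Apply the fitted linear equation to one complex."""
    return coef[0] + sum(w * x for w, x in zip(coef[1:], feature_row))


def jackknife_predictions(features, targets):
    """Leave-one-out: predict each complex from a model fit on all others."""
    preds = []
    for i in range(len(targets)):
        train_x = features[:i] + features[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        preds.append(predict(fit_ols(train_x, train_y), features[i]))
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def mean_absolute_error(a, b):
    """Mean absolute difference, in the units of the inputs."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Synthetic example: two scaled features per complex (hypothetical stand-ins
# for, e.g., an atom-contact count and a binding-site volume) and made-up
# ΔG values in kcal/mol. None of these numbers come from the study.
toy_features = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0),
                (5.0, 7.0), (6.0, 5.0), (7.0, 6.0), (8.0, 9.0)]
toy_dg = [-8.1, -8.9, -9.2, -10.1, -10.0, -11.2, -11.4, -11.6]

toy_pred = jackknife_predictions(toy_features, toy_dg)
print("jack-knife r   =", round(pearson_r(toy_dg, toy_pred), 3))
print("jack-knife MAE =", round(mean_absolute_error(toy_dg, toy_pred), 3), "kcal/mol")
```

Because every prediction comes from a model that never saw the held-out complex, the resulting correlation and error estimate how the equation generalises rather than how well it fits. Fitting one such equation per structural or functional class, as the abstract describes, would amount to running this procedure separately on each class subset.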