Kim RyangGuk, Guo Jun-tao
Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC 28223 USA.
BMC Genomics. 2009 Jul 7;10 Suppl 1(Suppl 1):S13. doi: 10.1186/1471-2164-10-S1-S13.
Knowledge of protein-DNA interactions at the structural level can provide insights into the mechanisms of protein-DNA recognition and gene regulation. Although over 1400 protein-DNA complex structures have been deposited in the Protein Data Bank (PDB), the structural details of protein-DNA interactions are generally not readily available. In addition, current approaches to comparing protein-DNA complexes are mainly based on protein sequence similarity and do not take the DNA sequences into account. As the number of experimentally determined protein-DNA complex structures increases, there is a need for an automatic program that analyzes these structures and provides comprehensive structural information for the benefit of the whole research community.
We developed an automatic and comprehensive protein-DNA complex structure analysis program, PDA (protein-DNA complex structure analyzer). PDA takes PDB files as input and performs structural analysis that includes 1) restoration of the whole protein-DNA complex structure, especially the reconstruction of double-stranded DNA structures; 2) an efficient new approach for DNA base-pair detection; 3) systematic annotation of protein-DNA interactions; and 4) extraction of the DNA subsequences involved in protein-DNA interactions and identification of protein-DNA binding units. Protein-DNA complex structures in the current PDB were processed and analyzed with PDA, and the results were stored in a database. A dataset useful for studying protein-DNA interactions involved in gene regulation was generated using both protein and DNA sequences as well as the contact information of the complexes. WebPDA was developed to provide a web interface for running PDA and for data retrieval.
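As an illustration of steps 3 and 4 above, the following is a minimal sketch of distance-based protein-DNA contact annotation. It is not the PDA implementation: the abstract does not describe PDA's contact criteria, so the 4.5 Å heavy-atom cutoff, the function names (`parse_atoms`, `find_contacts`, `contacted_dna_positions`), and the demo helper `atom_line` are all assumptions made for this example. It reads only `ATOM` records, assumes PDB v3 deoxynucleotide names (`DA`, `DT`, `DG`, `DC`), and treats every other residue as protein.

```python
import math
from collections import namedtuple

Atom = namedtuple("Atom", "name res_name chain res_seq x y z")

# Assumption: PDB v3 residue names for deoxynucleotides.
DNA_RESIDUES = {"DA", "DT", "DG", "DC"}


def parse_atoms(pdb_text):
    """Parse fixed-column ATOM records from PDB-format text."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atoms.append(Atom(
            name=line[12:16].strip(),
            res_name=line[17:20].strip(),
            chain=line[21],
            res_seq=int(line[22:26]),
            x=float(line[30:38]),
            y=float(line[38:46]),
            z=float(line[46:54]),
        ))
    return atoms


def find_contacts(atoms, cutoff=4.5):
    """Return sorted (protein residue, DNA residue) pairs with any atom
    pair within `cutoff` angstroms. The cutoff is an assumed value."""
    dna = [a for a in atoms if a.res_name in DNA_RESIDUES]
    protein = [a for a in atoms if a.res_name not in DNA_RESIDUES]
    contacts = set()
    for p in protein:
        for d in dna:
            if math.dist((p.x, p.y, p.z), (d.x, d.y, d.z)) <= cutoff:
                contacts.add(((p.chain, p.res_seq, p.res_name),
                              (d.chain, d.res_seq, d.res_name)))
    return sorted(contacts)


def contacted_dna_positions(contacts):
    """Group contacted DNA residue numbers by DNA chain."""
    positions = {}
    for _protein_res, (chain, res_seq, _res_name) in contacts:
        positions.setdefault(chain, set()).add(res_seq)
    return {chain: sorted(nums) for chain, nums in positions.items()}


def atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format one fixed-column ATOM record (hypothetical demo helper)."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Synthetic four-atom example, not taken from any real PDB entry.
demo = "\n".join([
    atom_line(1, "NZ", "LYS", "A", 10, 0.0, 0.0, 0.0),
    atom_line(2, "CA", "GLY", "A", 50, 30.0, 0.0, 0.0),
    atom_line(3, "OP1", "DG", "B", 3, 3.0, 0.0, 0.0),
    atom_line(4, "N7", "DA", "B", 4, 12.0, 0.0, 0.0),
])
contacts = find_contacts(parse_atoms(demo))
dna_positions = contacted_dna_positions(contacts)
```

In the synthetic example, only the lysine on chain A lies within the cutoff of a DNA atom, so `dna_positions` lists residue 3 of chain B as the contacted position. The pairwise loop is quadratic in the number of atoms; a spatial index would be the usual replacement for large complexes.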
PDA is a computational tool for the structural annotation of protein-DNA complexes. It provides a useful resource for investigating protein-DNA interactions. Data from the PDA analysis can also facilitate the classification of protein-DNA complexes and provide insights into the rational design of benchmarks. The PDA program is freely available at http://bioinfozen.uncc.edu/webpda.