School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia, 30332, USA.
Proteome Sci. 2012 Jun 21;10 Suppl 1(Suppl 1):S17. doi: 10.1186/1477-5956-10-S1-S17.
Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications for a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. It is computationally demanding because of the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that docking quality depends on how well the conformational space is covered during sampling. It is therefore desirable to accelerate the docking computation, not only to reduce computing time but also to improve docking quality.
To accelerate the sampling process and improve docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte Carlo simulation with simulated annealing to search the conformational space. Algorithmic techniques were developed to improve computational efficiency and scalability on GPU-based high performance computing systems.
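The search strategy described above can be illustrated with a minimal sketch of Monte Carlo sampling under a simulated annealing schedule. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `toy_energy` and `anneal`, the six-value pose, the quadratic stand-in for the energy function, the geometric cooling schedule, and all parameter values. The sketch runs on the CPU; the GPU implementation itself is not reproduced.

```python
import math
import random


def toy_energy(pose):
    """Hypothetical stand-in for a potential-based energy function.

    A quadratic bowl over a 6-value pose with its minimum at the origin.
    This is NOT the energy function used in the paper.
    """
    return sum(x * x for x in pose)


def anneal(energy, start, steps=5000, t_start=5.0, t_end=0.01,
           step_size=0.5, seed=0):
    """Metropolis Monte Carlo search with a geometric cooling schedule.

    Returns the lowest-energy pose seen and its energy.
    All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    pose = list(start)
    e = energy(pose)
    best_pose, best_e = list(pose), e
    for i in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        # Propose a random perturbation of the current pose.
        trial = [x + rng.uniform(-step_size, step_size) for x in pose]
        e_trial = energy(trial)
        delta = e_trial - e
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            pose, e = trial, e_trial
            if e < best_e:
                best_pose, best_e = list(pose), e
    return best_pose, best_e


if __name__ == "__main__":
    start = [3.0] * 6
    pose, e = anneal(toy_energy, start)
    print("start energy:", toy_energy(start))
    print("best energy found:", e)
```

Runs started from different seeds are independent of one another, which is one reason this kind of search is a natural candidate for parallel hardware; how the work is actually distributed across GPU threads in the authors' implementation is not stated in this abstract.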
The effectiveness of our approach was tested on a non-redundant set of 75 transcription factor (TF)-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further progress in protein-DNA docking will require effort on two integral aspects: computational efficiency and energy function design.
We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding near-native structures. To the best of our knowledge, this is the first ad hoc effort to apply GPUs or GPU clusters to the protein-DNA docking problem.