De Paris Renata, Quevedo Christian V, Ruiz Duncan D A, Norberto de Souza Osmar
Grupo de Pesquisa em Inteligência de Negócio-GPIN, Faculdade de Informática, PUCRS, Av. Ipiranga, 6681-Prédio 32, sala 628, Porto Alegre, RS, Brasil.
Laboratório de Bioinformática, Modelagem e Simulação de Biossistemas-LABIO, Faculdade de Informática, PUCRS, Av. Ipiranga, 6681-Building 32, Room 602, Porto Alegre, RS, Brasil.
PLoS One. 2015 Jul 28;10(7):e0133172. doi: 10.1371/journal.pone.0133172. eCollection 2015.
Protein receptor conformations obtained from molecular dynamics (MD) simulations have become a promising way to treat receptor flexibility explicitly in molecular docking experiments applied to drug discovery and development. However, incorporating the entire ensemble of MD conformations in docking experiments to screen large candidate compound libraries is currently infeasible. Clustering algorithms have been widely used to reduce such ensembles to a manageable size. Most studies investigate different algorithms using pairwise root-mean-square deviation (RMSD) values for all, or part, of the MD conformations. Nevertheless, RMSD alone may not be the most appropriate measure for clustering conformations when the target receptor has a plastic active site, since RMSD values are influenced by changes that occur in other parts of the structure. Hence, we applied two partitioning methods (k-means and k-medoids) and four agglomerative hierarchical methods (complete linkage, Ward's, Unweighted Pair Group Method and Weighted Pair Group Method) to analyze and compare the quality of partitions between a data set composed of properties of an enzyme receptor's substrate-binding cavity and two data sets created using different RMSD approaches. Ensembles of representative MD conformations were generated by selecting the medoid of each group from all partitions analyzed. We investigated the performance of our new method in evaluating the binding conformations of drug candidates to the InhA enzyme, using cross-docking experiments between a 20 ns MD trajectory and 20 different ligands. Statistical analyses showed that the novel ensemble, which comprises only 0.48% of the MD conformations, was able to reproduce 75% of all dynamic behaviors within the binding cavity for the docking experiments performed.
Moreover, this new approach not only outperforms the other two RMSD-clustering solutions, but also proves to be a promising strategy for distilling biologically relevant information from MD trajectories, especially for docking purposes.
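The step of selecting one medoid per cluster to build a reduced ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors, cluster labels, and the function names `medoid_index` and `representative_ensemble` are hypothetical, Euclidean distance is an assumed dissimilarity, and the cluster labels are taken as given (they could come from any of the partitioning or hierarchical methods named above).

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def medoid_index(members, features):
    """Return the member index whose summed distance to all other
    members of the same cluster is smallest (the cluster medoid)."""
    return min(
        members,
        key=lambda i: sum(dist(features[i], features[j]) for j in members),
    )


def representative_ensemble(features, labels):
    """Group conformation indices by cluster label and pick one medoid
    per cluster. Returns a dict {cluster_label: conformation_index}."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    return {
        label: medoid_index(members, features)
        for label, members in clusters.items()
    }


# Hypothetical toy data: each tuple stands in for per-conformation
# properties of the binding cavity; the values are invented.
features = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),  # conformations assigned to cluster 0
    (5.0, 5.0), (5.1, 4.8), (6.0, 6.0),  # conformations assigned to cluster 1
]
labels = [0, 0, 0, 1, 1, 1]

ensemble = representative_ensemble(features, labels)
print(ensemble)  # {0: 0, 1: 3}
```

In this toy case, six conformations are reduced to two representatives, one per cluster. Choosing a medoid rather than a centroid means each representative is an actual conformation from the trajectory, which matters when the selected structures are to be used directly as docking receptors.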