Faculty of Pharmaceutical Sciences, M.D. University, Rohtak, Haryana, India.
Chem Biol Drug Des. 2012 Jan;79(1):38-52. doi: 10.1111/j.1747-0285.2011.01264.x. Epub 2011 Nov 28.
Four highly discriminating fourth-generation topological indices (TIs), termed superaugmented eccentric distance sum connectivity indices, as well as their topochemical versions, have been conceptualized in this study. The values of these indices for all possible structures with three, four, and five vertices containing one heteroatom were computed using an in-house computer program. The proposed superaugmented eccentric distance sum connectivity topochemical indices exhibited exceptionally high discriminating power, low degeneracy, and high sensitivity toward both the presence and the relative position of heteroatom(s) for all possible structures with five vertices containing at least one heteroatom. Intercorrelation analysis revealed no correlation of the proposed indices with the Zagreb indices or the molecular connectivity index. Subsequently, the proposed TIs were successfully used to develop models for predicting the checkpoint kinase inhibitory activity of 2-arylbenzimidazoles. A data set comprising 47 differently substituted analogs of 2-arylbenzimidazoles was selected for the study. The values of various TIs for each analog in the data set were computed using an in-house computer program. The resulting data were analyzed, and suitable models were developed through decision tree (DT), random forest (RF), and moving average analysis (MAA). The performance of the models was assessed by calculating the specificity, sensitivity, overall accuracy, and Matthews correlation coefficient. A decision tree was constructed for the checkpoint kinase inhibitory activity to determine the importance of the topological indices. The decision tree identified proposed TIs as the most important indices. It classified the input (training) data with an accuracy of 96% and correctly predicted the 10-fold cross-validated data with an accuracy of 77%. Random forest correctly predicted the checkpoint kinase inhibitory activity with an accuracy of 83%. Single-index models for the prediction of checkpoint kinase inhibitory activity were also developed using MAA; their prediction accuracy ranged from a minimum of 90% to a maximum of 95%. The exceptionally high discriminating power, low degeneracy, and high sensitivity toward branching and the presence of heteroatoms make the proposed indices potentially of great use in drug design, isomer discrimination, similarity/dissimilarity studies, quantitative structure-activity/property relationships, lead optimization, and combinatorial library design.
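The four performance measures named above (sensitivity, specificity, overall accuracy, and the Matthews correlation coefficient) all follow from the counts of a binary confusion matrix. The following is a minimal sketch of the standard textbook definitions, not the authors' in-house program; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return standard binary-classification measures from confusion counts.

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    (Hypothetical helper for illustration only.)
    """
    total = tp + tn + fp + fn
    # Fraction of actives correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Fraction of inactives correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # Fraction of all compounds classified correctly.
    accuracy = (tp + tn) / total if total else float("nan")
    # Matthews correlation coefficient; by convention 0 when undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Made-up counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike overall accuracy, the Matthews correlation coefficient stays informative when the active and inactive classes are unbalanced, which is presumably why it is reported alongside the other three measures.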