Pourbasheer Eslam, Vahdani Saadat, Malekzadeh Davood, Aalizadeh Reza, Ebadi Amin
Department of Chemistry, Payame Noor University, Tehran, Iran.
Department of Chemistry, Islamic Azad University-North Tehran Branch, Tehran, Iran.
Iran J Pharm Res. 2017 Summer;16(3):966-980.
The 17β-HSD3 enzyme plays a key role in the treatment of prostate cancer, and small-molecule inhibitors can be used to target it efficiently. In the present study, multiple linear regression (MLR) and support vector machine (SVM) methods were used to relate chemical structural features to the inhibition activity of a set of 17β-HSD3 inhibitors. Chemical structural information was encoded by various types of molecular descriptors, and a genetic algorithm (GA) was applied to reduce the complexity of the inhibition pathway to a few relevant molecular descriptors. The non-linear method (GA-SVM) outperformed the linear method (GA-MLR) in both internal and external prediction accuracy. The SVM model, with high statistical significance (R² = 0.938 for the training set; R² = 0.870 for the test set), was found to be useful for estimating the inhibition activity of 17β-HSD3 inhibitors. The models were rigorously validated by leave-one-out cross-validation and by an external test set of several compounds. Furthermore, the external predictive power of the proposed model was examined using modified R² and concordance correlation coefficient values, the Golbraikh and Tropsha criteria for acceptable models, and an extra evaluation set drawn from an external data set. The applicability domain of the linear model was carefully defined using a Williams plot. Moreover, a Euclidean distance-based applicability domain was applied to characterize the chemical structural diversity of the evaluation set and the training set.
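Three of the validation quantities named in the abstract (leave-one-out cross-validation, the concordance correlation coefficient, and the leverages plotted in a Williams plot) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names are hypothetical, the regression is plain least squares with an intercept, and the warning leverage h* = 3(p + 1)/n is the convention commonly used in QSAR work, assumed here rather than taken from the paper.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones to a descriptor matrix (n compounds x p descriptors)."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X])

def leverages(X_train, X_query=None):
    """Leverage h_i = x_i^T (A^T A)^-1 x_i, the x-axis of a Williams plot."""
    A = add_intercept(X_train)
    Q = A if X_query is None else add_intercept(X_query)
    core = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, core, Q)

def warning_leverage(X_train):
    """Assumed threshold h* = 3(p + 1)/n for flagging influential compounds."""
    n, p = np.asarray(X_train).shape
    return 3.0 * (p + 1) / n

def loo_q2(X, y):
    """Leave-one-out Q^2 for an MLR model: refit without compound i, predict it."""
    A = add_intercept(X)
    y = np.asarray(y, dtype=float)
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def concordance_cc(y_obs, y_pred):
    """Lin's concordance correlation coefficient between observed and predicted values."""
    a = np.asarray(y_obs, dtype=float)
    b = np.asarray(y_pred, dtype=float)
    da, db = a - a.mean(), b - b.mean()
    denom = np.sum(da**2) + np.sum(db**2) + len(a) * (a.mean() - b.mean()) ** 2
    return 2.0 * np.sum(da * db) / denom
```

In a Williams plot, the leverages returned by `leverages` are plotted against standardized residuals, and compounds to the right of the `warning_leverage` line are treated as lying outside the applicability domain of the linear model.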