Cai Jie, Li Chanjuan, Liu Zhihong, Du Jiewen, Ye Jiming, Gu Qiong, Xu Jun
Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China.
Lipid Biology and Metabolic Disease Health Innovations Research Institute, RMIT University, PO Box 71, Melbourne, VIC, 3083, Australia.
J Comput Aided Mol Des. 2017 Apr;31(4):393-402. doi: 10.1007/s10822-017-0009-6. Epub 2017 Feb 2.
Dipeptidyl peptidase IV (DPP-IV) is a promising drug target for type 2 diabetes mellitus (T2DM). DPP-IV inhibitors prolong the action of glucagon-like peptide-1 (GLP-1) and gastric inhibitory peptide (GIP) and improve glucose homeostasis without causing weight gain, edema, or hypoglycemia. However, the marketed DPP-IV inhibitors have adverse effects such as nasopharyngitis, headache, nausea, hypersensitivity, skin reactions, and pancreatitis. Novel DPP-IV inhibitors with minimal adverse effects are therefore still needed. The scaffolds of existing DPP-IV inhibitors are structurally diverse, which makes it difficult to build virtual screening models from the known DPP-IV inhibitor libraries using conventional QSAR approaches. In this paper, we report a new strategy for predicting DPP-IV inhibitors with two machine learning methods: naïve Bayesian (NB) classification and recursive partitioning (RP). We built 247 machine learning models from 1307 known DPP-IV inhibitors, using optimized molecular properties and topological fingerprints as descriptors. The overall predictive accuracies of the optimized models were greater than 80%. An external test set of 65 recently reported compounds was used to validate the optimized models. The results demonstrated that both the NB and RP models have good predictive ability across different combinations of descriptors. Twenty "good" and twenty "bad" structural fragments for DPP-IV inhibitors were also derived from these models to inspire the design of new DPP-IV inhibitor scaffolds.
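As a rough illustration of the naïve Bayesian idea mentioned in the abstract, the sketch below trains a Bernoulli naive Bayes classifier on binary fingerprint bits and labels a new fingerprint as inhibitor (1) or non-inhibitor (0). It is not the authors' implementation: the 6-bit fingerprints are synthetic and invented for this example, the function names `train_bernoulli_nb` and `predict` are hypothetical, and the Laplace smoothing parameter `alpha` is an assumption rather than a setting reported in the paper.

```python
import math


def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on equal-length 0/1 fingerprints."""
    n_bits = len(fingerprints[0])
    model = {}
    for cls in sorted(set(labels)):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        # Laplace-smoothed probability that each bit is set within this class.
        p_on = [
            (sum(row[i] for row in rows) + alpha) / (len(rows) + 2 * alpha)
            for i in range(n_bits)
        ]
        # Store the log class prior together with the per-bit probabilities.
        model[cls] = (math.log(len(rows) / len(labels)), p_on)
    return model


def predict(model, fp):
    """Return the class with the highest log posterior for one fingerprint."""
    def log_posterior(cls):
        log_prior, p_on = model[cls]
        return log_prior + sum(
            math.log(p if bit else 1.0 - p) for bit, p in zip(fp, p_on)
        )
    return max(model, key=log_posterior)


# Synthetic 6-bit fingerprints, invented for illustration only.
# Label 1 = "inhibitor", label 0 = "non-inhibitor".
train_x = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 1],
]
train_y = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(train_x, train_y)
print(predict(model, [1, 1, 0, 0, 0, 0]))
print(predict(model, [0, 0, 0, 1, 0, 1]))
```

In a real workflow the bits would come from topological fingerprints computed on the compound structures, and model quality would be judged on a held-out external test set, as the abstract describes.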