Li Hang, Wang Maolin, Gong Ya-Nan, Yan Aixia
State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, P.O. Box 53, Beijing University of Chemical Technology, 15 BeiSanHuan East Road, Beijing 100029, P.R. China.
Comb Chem High Throughput Screen. 2016;19(6):470-80. doi: 10.2174/1386207319666160504095621.
β-secretase (BACE1) is an aspartyl protease that is considered an important novel target in Alzheimer's disease therapy. We collected a data set of 294 BACE1 inhibitors and built six classification models to discriminate active from weakly active inhibitors using Kohonen's Self-Organizing Map (SOM) and Support Vector Machine (SVM) methods. All molecular descriptors were calculated with the program ADRIANA.Code. Two methods were used to split the data into training and test sets: random selection and a Self-Organizing Map. The descriptors were selected by F-score and stepwise linear regression analysis. The best SVM model, Model 2C, performed well on the test set, with prediction accuracy, sensitivity (SE), specificity (SP) and Matthews correlation coefficient (MCC) of 89.02%, 90%, 88% and 0.78, respectively. Model 1A is the best SOM model, with a test-set accuracy and MCC of 94.57% and 0.98, respectively. Descriptors related to lone-pair electronegativity and polarizability contributed substantially to the bioactivity of BACE1 inhibitors. Extended-Connectivity Fingerprints (ECFP_4) analysis identified key substructural features that could be helpful for further drug design research. The SOM and SVM models built in this study can be obtained from the authors by email or other contacts.
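For readers unfamiliar with the reported metrics, the following minimal sketch shows how accuracy, SE, SP and MCC are conventionally computed from a binary confusion matrix. The function name `classification_metrics` and the example counts are illustrative assumptions; they are not taken from the study and do not reproduce its test-set composition.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: active compounds predicted as active / weakly active
    tn/fp: weakly active compounds predicted as weakly active / active
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # SE: fraction of actives recovered
    specificity = tn / (tn + fp)   # SP: fraction of weakly actives recovered
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal sum is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the active and weakly active classes are unequal in size.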