Department of Biosystems, Faculty of Bioscience Engineering, Katholieke Universiteit Leuven - KULeuven, Kasteelpark Arenberg 30, B-3001, Heverlee, Belgium.
Chem Biol Drug Des. 2013 Dec;82(6):685-96. doi: 10.1111/cbdd.12196. Epub 2013 Sep 19.
Quantitative structure-activity relationship (QSAR) modeling was performed for imidazo[1,5-a]pyrido[3,2-e]pyrazines, a class of phosphodiesterase 10A inhibitors. Particle swarm optimization (PSO) and a genetic algorithm (GA) were used as feature selection techniques to find the most reliable molecular descriptors in a large pool. The relationship between the selected descriptors and the pIC50 activity data was modeled with a linear method [multiple linear regression (MLR)] and a non-linear method [locally weighted regression (LWR) based on either Euclidean (E) or Mahalanobis (M) distances]. In addition, a stepwise MLR model was built using only a limited number of quantum chemical descriptors, selected for their correlation with pIC50; this model did not yield noteworthy results. It was concluded that the LWR model based on the Euclidean distance, applied to the descriptors selected by PSO, had the best predictive ability, although some other models performed similarly. The root-mean-squared errors of prediction (RMSEP) for the test sets obtained with the PSO/MLR, GA/MLR, PSO/LWRE, PSO/LWRM, GA/LWRE, and GA/LWRM models were 0.333, 0.394, 0.313, 0.333, 0.421, and 0.424, respectively. The PSO-selected descriptors gave the best prediction models, both linear and non-linear.
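To make the modeling step concrete, here is a minimal Python sketch of locally weighted regression with a choice of Euclidean or Mahalanobis distance, together with the RMSEP metric. It is an illustration only, not the authors' implementation: the abstract does not state the kernel, the neighborhood size, or any preprocessing, so the tricube weights, the `n_neighbors` parameter, and the function names `lwr_predict` and `rmsep` are all assumptions.

```python
import numpy as np


def rmsep(y_true, y_pred):
    """Root-mean-squared error of prediction over a test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def lwr_predict(X_train, y_train, X_query, n_neighbors=10, metric="euclidean"):
    """Predict each query point from a weighted linear fit to its nearest
    training neighbors. Kernel and neighbor count are assumed, not sourced."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n_features = X_train.shape[1]

    if metric == "mahalanobis":
        # Inverse covariance of the training descriptors (pseudo-inverse
        # guards against a singular covariance matrix).
        cov = np.atleast_2d(np.cov(X_train, rowvar=False))
        inv_cov = np.linalg.pinv(cov)
    elif metric == "euclidean":
        inv_cov = np.eye(n_features)
    else:
        raise ValueError("metric must be 'euclidean' or 'mahalanobis'")

    preds = []
    for x in X_query:
        diff = X_train - x
        # Squared distance diff_i^T * inv_cov * diff_i for every training row.
        d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        d = np.sqrt(np.maximum(d2, 0.0))
        idx = np.argsort(d)[:n_neighbors]
        d_max = d[idx].max()
        # Tricube weights (assumed kernel): closer neighbors count more.
        if d_max > 0:
            w = (1.0 - (d[idx] / d_max) ** 3) ** 3
        else:
            w = np.ones(len(idx))
        # Weighted least squares on the local neighborhood, with intercept.
        A = np.column_stack([np.ones(len(idx)), X_train[idx]])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y_train[idx] * sw, rcond=None)
        preds.append(coef[0] + x @ coef[1:])
    return np.array(preds)
```

With a descriptor matrix restricted to the selected features, calling `rmsep(y_test, lwr_predict(X_train, y_train, X_test, metric="mahalanobis"))` would give a test-set error comparable in kind to the values reported above, though the numbers depend on the assumed kernel and neighborhood size.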