Shen Min, LeTiran Arnaud, Xiao Yunde, Golbraikh Alexander, Kohn Harold, Tropsha Alexander
Division of Medicinal Chemistry and Natural Products, School of Pharmacy, CB# 7360, University of North Carolina, Chapel Hill, NC 27599-7360, USA.
J Med Chem. 2002 Jun 20;45(13):2811-23. doi: 10.1021/jm010488u.
We report the development of rigorously validated quantitative structure-activity relationship (QSAR) models for 48 chemically diverse functionalized amino acids with anticonvulsant activity. Two variable selection approaches, simulated annealing partial least squares (SA-PLS) and k-nearest neighbor (kNN), were employed. Both methods use multiple descriptors, such as molecular connectivity indices or atom pair descriptors, which are derived from two-dimensional molecular topology. QSAR models with high internal accuracy were generated, with leave-one-out cross-validated R² (q²) values ranging from 0.6 to 0.8. The q² values for the actual dataset were significantly higher than those obtained for the same dataset with randomly shuffled activity values, indicating that the models were statistically significant. The original dataset was further divided into several training and test sets, with highly predictive models providing q² values greater than 0.5 for the training sets and R² values greater than 0.6 for the test sets. These models predicted with reasonable accuracy the activity of 13 novel compounds not included in the original dataset. The successful development of highly predictive QSAR models should facilitate the further design and discovery of novel anticonvulsant agents.
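The two internal validation steps named in the abstract, leave-one-out cross-validated R² (q²) and comparison against randomly shuffled activities, can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain unweighted kNN regressor with Euclidean distance and no variable selection, it computes q² as 1 - PRESS/SS with SS taken about the overall mean activity, and it runs on synthetic data rather than the 48 compounds. The function names `loo_q2_knn` and `y_randomization` and all parameter values are hypothetical.

```python
import numpy as np


def loo_q2_knn(X, y, k=3):
    """Leave-one-out q^2 for an unweighted kNN regressor (assumed form)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise Euclidean distances between all compounds in descriptor space.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # A compound may not be its own neighbor: this is the leave-one-out step.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k nearest other compounds for each compound.
    nearest = np.argsort(dist, axis=1)[:, :k]
    # Predicted activity is the mean activity of those k neighbors.
    y_pred = y[nearest].mean(axis=1)
    press = np.sum((y - y_pred) ** 2)        # predictive residual sum of squares
    ss_total = np.sum((y - y.mean()) ** 2)   # variance about the overall mean
    return 1.0 - press / ss_total


def y_randomization(X, y, k=3, n_shuffles=100, seed=0):
    """q^2 values obtained after randomly shuffling the activity values."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    return np.array(
        [loo_q2_knn(X, rng.permutation(y), k) for _ in range(n_shuffles)]
    )


# Synthetic stand-in data: 48 "compounds" with 2 descriptors (an assumption,
# chosen only to match the dataset size mentioned in the abstract).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(48, 2))
y_demo = X_demo[:, 0] + 0.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=48)

q2_real = loo_q2_knn(X_demo, y_demo)
q2_shuffled = y_randomization(X_demo, y_demo)
print(f"q2 (real activities):     {q2_real:.3f}")
print(f"q2 (shuffled, mean/max):  {q2_shuffled.mean():.3f} / {q2_shuffled.max():.3f}")
```

A model would be considered statistically meaningful in this sense when the q² for the real activities clearly exceeds the distribution of q² values from the shuffled runs. External validation on held-out test sets, which the abstract also reports, would require a separate train/test split and is not shown here.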