Duchowicz Pablo R
Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, La Plata 1900, Argentina.
Cells. 2018 Feb 14;7(2):13. doi: 10.3390/cells7020013.
A structurally diverse dataset of 530 polo-like kinase-1 (PLK1) inhibitors is compiled from the ChEMBL database and studied with a conformation-independent quantitative structure-activity relationship (QSAR) approach. A large number (26,761) of molecular descriptors are explored, with the main aim of capturing the structural characteristics that most strongly affect bioactivity. The structural descriptors are derived with several freeware programs, such as PaDEL, Mold², and QuBiLs-MAS; these programs complement one another and improve the QSAR results. The best multivariable linear regression models are found with the replacement method, a variable subset selection technique. The balanced subsets method is used to partition the dataset into training, validation, and test sets. The proposed linear QSAR model improves on previously reported models by providing a simpler alternative structure-activity relationship.
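The variable subset selection step named in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it is a simplified, greedy coordinate-wise variant of a replacement-style search, written with NumPy only. The function names (`rmse_of_subset`, `replacement_search`), the random starting subset, and the use of the root-mean-square residual as the selection criterion are assumptions made for illustration.

```python
import numpy as np

def rmse_of_subset(X, y, cols):
    """Root-mean-square residual of a least-squares linear fit on the chosen columns."""
    # Prepend a column of ones so the model includes an intercept.
    A = np.column_stack([np.ones(len(y)), X[:, cols]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

def replacement_search(X, y, d, seed=0, max_sweeps=20):
    """Hypothetical simplified replacement-style search for d descriptors.

    Starts from a random subset, then repeatedly tries to swap each selected
    descriptor for every unselected one, keeping any swap that lowers the
    residual. Stops when a full sweep yields no improvement.
    """
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    subset = [int(c) for c in rng.choice(n_desc, size=d, replace=False)]
    best = rmse_of_subset(X, y, subset)
    for _ in range(max_sweeps):
        improved = False
        for pos in range(d):
            for cand in range(n_desc):
                if cand in subset:
                    continue
                trial = subset.copy()
                trial[pos] = cand
                score = rmse_of_subset(X, y, trial)
                if score < best - 1e-12:
                    subset, best, improved = trial, score, True
        if not improved:
            break
    return sorted(subset), best

if __name__ == "__main__":
    # Synthetic stand-in for a descriptor matrix; not the PLK1 dataset.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(60, 12))
    y = 2.0 * X[:, 3] - 1.0 * X[:, 7] + 0.01 * rng.normal(size=60)
    print(replacement_search(X, y, d=2))
```

In practice such a search would be run on the training set only, with the validation and test sets held back to check the selected model; a pool of 26,761 descriptors would also need a far more efficient fitting routine than the brute-force refits used here.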