Zhang Shuxing, Golbraikh Alexander, Tropsha Alexander
The Laboratory for Molecular Modeling, Division of Medicinal Chemistry and Natural Products, School of Pharmacy, University of North Carolina at Chapel Hill, North Carolina 27599-7360, USA.
J Med Chem. 2006 May 4;49(9):2713-24. doi: 10.1021/jm050260x.
Novel geometrical chemical descriptors have been derived from the computational geometry of protein-ligand interfaces and Pauling atomic electronegativities (EN). Delaunay tessellation was applied to a diverse set of 517 X-ray characterized protein-ligand complexes, yielding a unique collection of interfacial nearest-neighbor atomic quadruplets for each complex. Each quadruplet composition was characterized by a single descriptor calculated as the sum of the EN values for the four participating atom types. We termed these simple descriptors, generated from atomic EN values and derived with Delaunay tessellation, the ENTess descriptors, and used them in variable-selection k-nearest-neighbor quantitative structure-binding affinity relationship (QSBR) studies of 264 diverse protein-ligand complexes with known binding constants. Twenty-four complexes with chemically dissimilar ligands were set aside as an independent validation set, and the remaining dataset of 240 complexes was divided into multiple training and test sets. The best models had a leave-one-out cross-validated correlation coefficient q² as high as 0.66 for the training set and a correlation coefficient R² as high as 0.83 for the test set. The high predictive power of these models was confirmed independently by applying them to the validation set of 24 complexes, which yielded R² as high as 0.85. We conclude that QSBR models built with the ENTess descriptors can be instrumental for predicting the binding affinity of receptor-ligand complexes.
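The descriptor definition in the abstract (one value per quadruplet composition, equal to the sum of the four Pauling EN values) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Delaunay tessellation has already been computed elsewhere (for example with `scipy.spatial.Delaunay`) and that each quadruplet arrives as four `(element, source)` pairs, where `source` is `"P"` for protein or `"L"` for ligand. The function names, the `"P"`/`"L"` labels, the rule that an interfacial quadruplet must mix both sources, and the accumulation of EN sums per composition within a complex are all assumptions made for illustration.

```python
from collections import defaultdict

# Pauling electronegativities (standard tabulated values) for a few common elements.
PAULING_EN = {
    "H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44,
    "F": 3.98, "P": 2.19, "S": 2.58, "Cl": 3.16,
}


def quadruplet_en_sum(elements):
    """Return the sum of Pauling EN values for the four atoms of one quadruplet."""
    if len(elements) != 4:
        raise ValueError("a Delaunay quadruplet has exactly four atoms")
    return sum(PAULING_EN[element] for element in elements)


def entess_like_descriptors(quadruplets):
    """Accumulate EN sums per quadruplet composition for one complex.

    quadruplets: iterable of 4-tuples of (element, source) pairs, where source
    is "P" (protein) or "L" (ligand). Hypothetical input format.
    Only quadruplets containing both protein and ligand atoms are kept
    (assumed definition of "interfacial").
    """
    descriptors = defaultdict(float)
    for quad in quadruplets:
        sources = {source for _, source in quad}
        if sources != {"P", "L"}:
            continue  # skip protein-only or ligand-only quadruplets
        # Order-independent key for the composition, e.g. ("L-C", "L-O", "P-C", "P-N").
        key = tuple(sorted(f"{source}-{element}" for element, source in quad))
        descriptors[key] += quadruplet_en_sum([element for element, _ in quad])
    return dict(descriptors)


if __name__ == "__main__":
    example = [
        (("C", "P"), ("N", "P"), ("O", "L"), ("C", "L")),  # interfacial
        (("C", "P"), ("C", "P"), ("C", "P"), ("C", "P")),  # protein only, skipped
        (("C", "L"), ("O", "L"), ("N", "P"), ("C", "P")),  # same composition as the first
    ]
    for composition, value in entess_like_descriptors(example).items():
        print(composition, round(value, 2))
```

In this sketch each complex becomes a sparse vector indexed by composition, which is the kind of input a variable-selection kNN model could consume. How the published method aggregates quadruplets of the same composition is not stated in the abstract, so the summation here is only one plausible choice.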