Gleeson M Paul
Computational, Analytical & Structural Sciences, GlaxoSmithKline Medicines Research Centre, Stevenage, Hertfordshire, SG1 2NY, United Kingdom.
J Med Chem. 2007 Jan 11;50(1):101-12. doi: 10.1021/jm060981b.
In silico plasma protein binding (PPB) models have been generated on in-house human and rat datasets and on a human dataset sourced from the literature. The results show that models built on datasets relevant to the chemotypes under investigation in lead optimization programs perform considerably better in that setting than models generated on diverse compounds sourced from the literature. The in-house human and rat partial least-squares (PLS) regression models have cross-validated q² values of 0.53 and 0.42 on the training sets, respectively. On the independent test and validation sets they display similar predictive ability, with logK prediction errors of approximately 0.5 log units, compared with the approximately 0.25 log units of variability expected from experiment. Given the considerable interspecies PPB differences, predicting PPB in one species from measurements in the other is no better than using an in silico model generated on that species.
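The two statistics quoted above, cross-validated q² and the prediction error in log units, can be illustrated with a minimal Python sketch. It assumes the common definition q² = 1 - PRESS / SS_total, with the total sum of squares taken about the mean of all observed values, and it assumes the prediction error is a root-mean-square error; the abstract does not state which variants the author used. The function names and the sample logK values are hypothetical and are not data from the paper.

```python
import numpy as np


def q2_score(y_obs, y_cv_pred):
    """Cross-validated q2, assuming q2 = 1 - PRESS / SS_total.

    y_obs:     observed values (e.g. logK)
    y_cv_pred: predictions for the same compounds, each made by a model
               that did not see that compound during fitting
    """
    y_obs = np.asarray(y_obs, dtype=float)
    y_cv_pred = np.asarray(y_cv_pred, dtype=float)
    press = np.sum((y_obs - y_cv_pred) ** 2)        # predictive residual sum of squares
    ss_total = np.sum((y_obs - y_obs.mean()) ** 2)  # spread about the observed mean
    return 1.0 - press / ss_total


def rmse(y_obs, y_pred):
    """Root-mean-square prediction error, in the units of y (here log units)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))


if __name__ == "__main__":
    # Hypothetical logK values, for illustration only.
    observed = [0.2, 0.9, 1.4, 1.8, 2.5]
    predicted = [0.6, 0.7, 1.1, 2.2, 2.1]
    print("q2   =", q2_score(observed, predicted))
    print("RMSE =", rmse(observed, predicted))
```

Under this definition a q² of 1 means the held-out predictions are exact, and a q² of 0 means they are no better than always predicting the mean observed value, which gives some context for the reported values of 0.53 and 0.42.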