Seierstad Mark, Agrafiotis Dimitris K
Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 3210 Merryfield Row, San Diego, CA 92121, USA.
Chem Biol Drug Des. 2006 Apr;67(4):284-96. doi: 10.1111/j.1747-0285.2006.00379.x.
Over the past decade, the pharmaceutical industry has begun to address an addition to ADME/Tox profiling: the ability of a compound to bind to and inhibit the human ether-a-go-go-related gene (hERG)-encoded cardiac potassium channel. Having compiled a large and diverse set of compounds measured in a single, consistent hERG channel inhibition assay, we recognized a unique opportunity to construct predictive QSAR models. Early efforts with classification models built from this training set were very encouraging. Here, we report a systematic evaluation of regression models based on neural network ensembles in conjunction with a variety of structure representations and feature selection algorithms. The combination of these modeling techniques (neural networks to capture non-linear relationships in the data, feature selection to prevent over-fitting, and aggregation to minimize model instability) was found to produce models with very good internal cross-validation statistics and good predictivity on external data.
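The three ingredients named in the abstract (a non-linear learner, feature selection, and aggregation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the descriptors, the feature selection algorithms, or the network architecture, so everything below is an assumption chosen for brevity. It uses a correlation filter for feature selection, a small one-hidden-layer tanh network trained by SGD, bootstrap resampling with averaging for aggregation, and synthetic data in place of hERG measurements. All function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def select_features(X, y, k):
    """Filter step (assumed): keep the k columns with the largest |r| against y."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    return sorted(sorted(range(len(scores)), key=lambda j: -scores[j])[:k])


def forward(net, x):
    """Return (hidden activations plus bias unit, network output) for one input."""
    W1, W2 = net
    xb = x + [1.0]
    h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in W1] + [1.0]
    return h, sum(w * v for w, v in zip(W2, h))


def train_net(X, y, hidden, epochs, lr, rng):
    """One-hidden-layer tanh network fitted by plain SGD on squared error."""
    d = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d + 1)] for _ in range(hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xb = X[i] + [1.0]
            h, out = forward((W1, W2), X[i])
            err = out - y[i]
            for j in range(hidden):  # hidden-layer update uses the old W2
                g = err * W2[j] * (1.0 - h[j] ** 2)
                for t in range(d + 1):
                    W1[j][t] -= lr * g * xb[t]
            for j in range(hidden + 1):
                W2[j] -= lr * err * h[j]
    return W1, W2


def fit_ensemble(X, y, n_models=10, k=2, hidden=4, epochs=200, lr=0.05, seed=0):
    """Aggregation: each member sees its own bootstrap sample and feature subset."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        feats = select_features(Xb, yb, k)
        net = train_net([[r[j] for j in feats] for r in Xb], yb, hidden, epochs, lr, rng)
        models.append((feats, net))
    return models


def predict_ensemble(models, x):
    """Average the member predictions for one input row."""
    preds = [forward(net, [x[j] for j in feats])[1] for feats, net in models]
    return sum(preds) / len(preds)


def make_data(n, rng):
    """Synthetic stand-in data: columns 0 and 1 carry signal, 2 and 3 are noise."""
    X = [[rng.uniform(-1.5, 1.5) for _ in range(4)] for _ in range(n)]
    y = [math.sin(r[0]) + 0.5 * r[1] + rng.gauss(0.0, 0.1) for r in X]
    return X, y


rng = random.Random(1)
X_train, y_train = make_data(60, rng)
X_test, y_test = make_data(30, rng)
models = fit_ensemble(X_train, y_train)
mean_y = sum(y_train) / len(y_train)
rmse = math.sqrt(sum((predict_ensemble(models, x) - t) ** 2
                     for x, t in zip(X_test, y_test)) / len(y_test))
baseline = math.sqrt(sum((mean_y - t) ** 2 for t in y_test) / len(y_test))
print(f"ensemble RMSE = {rmse:.3f}, mean-predictor RMSE = {baseline:.3f}")
```

Running feature selection inside each bootstrap sample, rather than once on the full training set, is a design choice of this sketch: it keeps the selection step from seeing data that a given member was not trained on. The held-out comparison against a mean predictor is only a sanity check and does not reproduce the cross-validation or external validation reported in the paper.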