Baskin Igor I, Palyulin Vladimir A, Zefirov Nikolai S
Department of Chemistry, Moscow State University, Russia.
Methods Mol Biol. 2008;458:137-58.
This chapter critically reviews some of the important methods used to build quantitative structure-activity relationship (QSAR) models with artificial neural networks (ANNs). It focuses predominantly on the use of multilayer ANNs in the regression analysis of structure-activity data. The highlighted topics include the approximating ability of ANNs, the interpretability of the resulting models, the issues of generalization and memorization, the problems of overfitting and overtraining, the learning dynamics, regularization, and the use of neural network ensembles. The next part of the chapter focuses on the use of descriptors. It reviews different descriptor selection and preprocessing techniques; considers the use of substituent, substructural, and superstructural descriptors in building common QSAR models and of molecular field descriptors in three-dimensional QSAR studies; and discusses the prospects of "direct" graph-based QSAR analysis. The chapter starts with a short historical survey of the main milestones in this area.