So S S, Karplus M
Department of Chemistry, Harvard University, Cambridge, Massachusetts 02138, USA.
J Med Chem. 1996 Mar 29;39(7):1521-30. doi: 10.1021/jm9507035.
A new hybrid method (GNN) combining a genetic algorithm and an artificial neural network has been developed for quantitative structure-activity relationship (QSAR) studies. A suitable set of molecular descriptors is selected by a genetic algorithm. This set serves as input to a neural network, in which model-free mapping of multivariate data is performed. The method is tested on the Selwood data set, where it generates multiple predictors that are superior to results obtained in previous studies. The neural network technique provides a graphical description of the functional form of the descriptors that play an important role in determining drug activity, which can serve as an aid in the future design of drug analogues. The effectiveness of GNN is tested by comparing its results with a benchmark obtained by exhaustive enumeration. Different fitness strategies that tune the evolution of genetic models are examined, and QSARs with higher predictiveness are found. From these results, a composite model is constructed by averaging predictions from several high-ranking models. The predictions of the resulting QSAR should be more reliable than those derived from a single predictor because the composite makes greater use of the available information and also permits error estimation. An analysis of the sets of descriptors selected by GNN shows that, to obtain a satisfactory QSAR for this data set, it is essential to have one descriptor each for the steric, electrostatic, and hydrophobic attributes of a drug candidate. This type of result is expected to be of general utility in designing and understanding QSAR.
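The two ideas in the abstract that lend themselves to code are the genetic search over descriptor subsets and the composite model built by averaging several high-ranking predictors. The following is a minimal Python sketch of both, not the authors' implementation. All names (`evolve_subsets`, `composite_predict`, the `fitness` callback) are hypothetical. The `fitness` callback stands in for whatever score a trained neural network would assign to a descriptor subset, and the search uses mutation with elitist selection only, which is a simplification of a full genetic algorithm.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

Subset = Tuple[int, ...]  # sorted indices of the chosen descriptors


def evolve_subsets(
    n_descriptors: int,
    subset_size: int,
    fitness: Callable[[Subset], float],
    pop_size: int = 20,
    generations: int = 30,
    seed: int = 0,
) -> List[Subset]:
    """Return descriptor subsets ranked from best to worst fitness.

    Assumes subset_size < n_descriptors so a mutation always has an
    unused descriptor to swap in.
    """
    rng = random.Random(seed)

    def random_subset() -> Subset:
        return tuple(sorted(rng.sample(range(n_descriptors), subset_size)))

    def mutate(subset: Subset) -> Subset:
        # Replace one chosen descriptor with one that is not yet chosen.
        unused = [d for d in range(n_descriptors) if d not in subset]
        child = list(subset)
        child[rng.randrange(subset_size)] = rng.choice(unused)
        return tuple(sorted(child))

    population = {random_subset() for _ in range(pop_size)}
    for _ in range(generations):
        children = {mutate(s) for s in population}
        # Elitist selection: keep the best of parents and children.
        ranked = sorted(population | children, key=fitness, reverse=True)
        population = set(ranked[:pop_size])
    return sorted(population, key=fitness, reverse=True)


def composite_predict(
    models: Sequence[Callable[[float], float]], x: float
) -> Tuple[float, float]:
    """Average the predictions of several models (at least two).

    The sample standard deviation of the individual predictions is
    returned as a rough error estimate.
    """
    predictions = [model(x) for model in models]
    return statistics.mean(predictions), statistics.stdev(predictions)
```

In practice `fitness` would train and cross-validate a small network on the chosen descriptors, and the models passed to `composite_predict` would be the networks built from the top-ranked subsets. Both are left abstract here because the abstract does not specify them.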