Turner Joseph V, Cutler David J, Spence Ian, Maddalena Desmond J
Faculty of Pharmacy, The University of Sydney, NSW 2006, Australia.
J Comput Chem. 2003 May;24(7):891-7. doi: 10.1002/jcc.10148.
Selection of optimal descriptors in quantitative structure-activity-property relationship (QSAR/QSPR) studies has been a perennial problem. Artificial Neural Networks (ANNs) have been used widely in QSAR/QSPR studies but less widely in descriptor selection. The current study used ANNs to select an optimal set of descriptors using large numbers of input variables. The effects of clean, noisy, and random input descriptors with linear, nonlinear, and periodic data on synthetic and real data QSAR/QSPR sets were examined. The optimal set of descriptors could be determined using a signal-to-noise ratio method. The optimal values for the rho parameter, which relates sample size to network architecture, were found to vary with the type of data. ANNs were able to detect meaningful descriptors in the presence of large numbers of random false descriptors.
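The abstract describes rho only as a parameter relating sample size to network architecture. A minimal sketch follows, assuming the definition commonly used in the QSAR neural-network literature: rho is the number of training samples divided by the number of adjustable connection weights. It also assumes a fully connected network with one hidden layer and bias units. The function names and the example sizes are hypothetical and are not taken from the paper.

```python
def count_weights(n_inputs, n_hidden, n_outputs=1, bias=True):
    """Adjustable weights in a fully connected single-hidden-layer network.

    Assumption: one bias unit feeds the hidden layer and one feeds the
    output layer when bias=True.
    """
    b = 1 if bias else 0
    return (n_inputs + b) * n_hidden + (n_hidden + b) * n_outputs


def rho(n_samples, n_inputs, n_hidden, n_outputs=1, bias=True):
    """Assumed definition: training samples per adjustable weight."""
    return n_samples / count_weights(n_inputs, n_hidden, n_outputs, bias)


def max_hidden_for_rho(n_samples, n_inputs, target_rho, n_outputs=1, bias=True):
    """Largest hidden-layer size that keeps rho at or above target_rho."""
    b = 1 if bias else 0
    # weights = n_hidden * (n_inputs + b + n_outputs) + b * n_outputs
    per_hidden = n_inputs + b + n_outputs
    budget = n_samples / target_rho - b * n_outputs
    return max(int(budget // per_hidden), 0)


if __name__ == "__main__":
    # Hypothetical data set: 74 compounds, 10 descriptors, 3 hidden units.
    print(count_weights(10, 3))            # 37
    print(rho(74, 10, 3))                  # 2.0
    print(max_hidden_for_rho(74, 10, 2.0)) # 3
```

Under this assumed definition, adding input descriptors increases the weight count and lowers rho for a fixed sample size, which is one reason descriptor selection and network sizing interact.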