Senese Craig L, Duca J, Pan D, Hopfinger A J, Tseng Y J
Laboratory of Molecular Modeling and Design (MC 781), College of Pharmacy, The University of Illinois at Chicago, 833 South Wood Street, Chicago, Illinois 60612-7231, USA.
J Chem Inf Comput Sci. 2004 Sep-Oct;44(5):1526-39. doi: 10.1021/ci049898s.
An elusive goal in the field of chemoinformatics and molecular modeling has been the generation of a set of descriptors that, once calculated for a molecule, may be used in a wide variety of applications. Since such universal descriptors are generated free from external constraints, they are inherently independent of the data set in which they are employed. The realization of a set of universal descriptors would significantly streamline such chemoinformatics tasks as virtual high-throughput screening (VHTS) and toxicity profiling. The current study reports the derivation and validation of a potential set of universal descriptors, referred to as the 4D-fingerprints. The 4D-fingerprints are derived from the 4D-molecular similarity analysis. To evaluate the applicability of the 4D-fingerprints as universal descriptors, they are used to generate descriptive QSAR models for 5 independent training sets. Each of the training sets has been analyzed previously by several different QSAR methods, and the results of the models generated using the 4D-fingerprints are compared to the results of the previous QSAR analyses. It was found that the models generated using the 4D-fingerprints are comparable in quality, based on statistical measures of fit and test set prediction, to the previously reported models for the other QSAR methods. This finding is particularly significant considering that the 4D-fingerprints are generated independently of external constraints such as alignment, while the QSAR methods used for comparison all require an alignment analysis.
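The abstract compares models by "statistical measures of fit and test set prediction" without naming the statistics. As a minimal sketch of what such a comparison can look like, the Python below fits a one-descriptor linear model to a training set and reports the coefficient of determination on both the training and the test set. Everything in it is an assumption made for illustration: the single descriptor, the choice of r² as the statistic, the function names `fit_line` and `r_squared`, and the invented toy numbers. It is not the 4D-fingerprint procedure or the authors' model-building method.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented toy data (NOT from the paper): one hypothetical descriptor value
# per molecule and a hypothetical activity value.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.1, 2.9, 5.2, 6.8, 9.0]
test_x = [1.5, 2.5, 3.5]
test_y = [4.0, 6.1, 7.9]

# Build the model on the training set only.
slope, intercept = fit_line(train_x, train_y)

# Measure of fit: how well the model reproduces the training activities.
fit_r2 = r_squared(train_y, [slope * x + intercept for x in train_x])

# Test set prediction: how well it predicts molecules it was not fitted on.
test_r2 = r_squared(test_y, [slope * x + intercept for x in test_x])

print(f"training r^2 = {fit_r2:.4f}, test r^2 = {test_r2:.4f}")
```

Two models built from different descriptor sets could be compared by running both through the same pair of statistics on the same training and test molecules, which is the kind of comparison the abstract describes in general terms.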