Prediction of the isoelectric point of an amino acid based on GA-PLS and SVMs.

Author information

Liu H X, Zhang R S, Yao X J, Liu M C, Hu Z D, Fan B T

Affiliation

Department of Chemistry, Lanzhou University, Lanzhou 730000, China.

Publication information

J Chem Inf Comput Sci. 2004 Jan-Feb;44(1):161-7. doi: 10.1021/ci034173u.

Abstract

The support vector machine (SVM), a novel type of learning machine, was used for the first time to develop a quantitative structure-property relationship (QSPR) model relating the structures of 35 amino acids to their isoelectric points. Molecular structures were represented by molecular descriptors calculated from the structure alone. Seven descriptors were selected using GA-PLS, a hybrid approach that combines the genetic algorithm (GA) as an optimization method with partial least squares (PLS) as a statistical method for variable selection. These descriptors served as inputs to radial basis function neural networks (RBFNNs) and an SVM to predict the isoelectric point of an amino acid. The best QSPR model was the SVM-based one, which gave a root-mean-square error of 0.2383 and a prediction correlation coefficient R = 0.9702 for the whole data set. These results indicate that GA-PLS is an effective method for variable selection and that the SVM is a promising tool for nonlinear approximation.
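The two figures of merit quoted in the abstract can be illustrated with a minimal sketch. It assumes that the root-mean-square error is averaged over all n samples and that R is the Pearson correlation between observed and predicted isoelectric points; the abstract does not state either definition explicitly. The function names and the four data points are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, averaged over all n samples (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values (assumed meaning of R)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length and at least two values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    spread_o = sum((o - mean_o) ** 2 for o in observed)
    spread_p = sum((p - mean_p) ** 2 for p in predicted)
    return covariance / math.sqrt(spread_o * spread_p)


# Hypothetical isoelectric points, for illustration only (not data from the paper).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R    = {pearson_r(observed, predicted):.4f}")
```

In a study like this one, `observed` would hold the experimental isoelectric points of the 35 amino acids and `predicted` the outputs of the fitted model.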
