Wang Jie, Du Hongying, Liu Huanxiang, Yao Xiaojun, Hu Zhide, Fan Botao
Department of Chemistry, Lanzhou University, Lanzhou, China.
Talanta. 2007 Aug 15;73(1):147-56. doi: 10.1016/j.talanta.2007.03.037. Epub 2007 Mar 24.
The support vector machine (SVM), a novel machine learning method, was used for the first time to develop a quantitative structure-property relationship (QSPR) model for the latest surface tension data of a diverse set of common liquid compounds. Each compound was represented by structural descriptors calculated from its molecular structure with the CODESSA program. The heuristic method (HM) was used to search the descriptor space, select the descriptors most relevant to surface tension, and build the best linear regression model from the selected descriptors. A non-linear regression model was then built with the SVM using the same descriptors. Comparison of the two approaches showed that the non-linear SVM model gave better predictions than the linear model obtained with the heuristic method. Interpreting the molecular descriptors selected by the heuristic method gave some insight into the factors likely to govern the surface tension of these diverse compounds. This paper proposes a new and effective approach to research in interface chemistry that can be very helpful to industry.
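The descriptor-selection step described in the abstract can be illustrated with a minimal sketch. This is not the CODESSA heuristic method itself, only a simplified stand-in under stated assumptions: descriptors are ranked by squared Pearson correlation with the property, constant descriptors are dropped, and a descriptor is skipped if it is highly intercorrelated with one already chosen. The function names, the thresholds `max_intercorr` and `top_k`, the descriptor names, and all numbers below are hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_descriptors(descriptors, target, max_intercorr=0.9, top_k=2):
    """Greedy descriptor selection (simplified stand-in, not the CODESSA algorithm).

    descriptors: dict mapping descriptor name -> list of values, one per compound
    target: list of property values, one per compound
    """
    # Step 1: drop descriptors that take a single value for every compound.
    informative = {k: v for k, v in descriptors.items() if len(set(v)) > 1}

    # Step 2: rank the rest by squared correlation with the property.
    ranked = sorted(
        informative,
        key=lambda name: pearson(informative[name], target) ** 2,
        reverse=True,
    )

    # Step 3: accept descriptors in rank order, skipping any that is
    # highly intercorrelated with a descriptor already accepted.
    selected = []
    for name in ranked:
        if all(
            abs(pearson(informative[name], informative[s])) <= max_intercorr
            for s in selected
        ):
            selected.append(name)
        if len(selected) == top_k:
            break
    return selected


# Toy data, invented for illustration only (not from the paper).
toy_target = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
toy_descriptors = {
    "desc_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],    # tracks the property
    "desc_b": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],   # near-duplicate of desc_a
    "desc_c": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],    # weakly related, independent
    "desc_d": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],    # constant, carries no information
}

selected = select_descriptors(toy_descriptors, toy_target)
print(selected)
```

On this toy input the sketch keeps `desc_a` and `desc_c`: `desc_d` is constant, and `desc_b` is rejected because it is almost perfectly correlated with `desc_a`. In a workflow like the one the abstract describes, the selected descriptors would then feed both a linear regression and a non-linear SVM regression so the two models can be compared on the same inputs.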