Baumes L A, Serra J M, Serna P, Corma A
Instituto de Tecnología Química (UPV-CSIC), av. Naranjos s/n, 46022 Valencia, Spain.
J Comb Chem. 2006 Jul-Aug;8(4):583-96. doi: 10.1021/cc050093m.
This work provides an introduction to support vector machines (SVMs) for predictive modeling in heterogeneous catalysis, describing the methodology step by step and highlighting the points that make the technique an attractive approach. We first investigate linear SVMs, working in detail through a simple example based on experimental data from a high-throughput experimentation study aimed at optimizing olefin epoxidation catalysts. This case study was chosen because the small number of catalytic variables investigated allows SVM features to be illustrated visually. It is shown how SVMs transform the original data into another representation space of higher dimensionality. The concepts of Vapnik-Chervonenkis dimension and structural risk minimization are introduced. The SVM methodology is then evaluated on a second catalytic application, light paraffin isomerization. Finally, we discuss why SVMs are a strategic method compared to other machine learning techniques, such as neural networks or induction trees, and why emphasis is put on the problem of overfitting.
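The abstract's remark that SVMs transform the original data into a representation space of higher dimensionality can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four data points are made up (they are not the epoxidation or isomerization data), the feature map `phi` is a hypothetical choice, and the separating hyperplane is hand-picked rather than obtained by SVM training. It only shows why such a mapping can help: labels that no single threshold separates in one dimension become linearly separable after the mapping.

```python
# Toy illustration with made-up data (not taken from the paper).

def phi(x):
    """Hypothetical feature map from 1-D to 2-D: x -> (x, x^2)."""
    return (x, x * x)


def decision(w, b, z):
    """Linear decision function w . z + b in the mapped space."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


def separable_by_threshold(X, y):
    """Return True if some 1-D threshold (either orientation) classifies all points."""
    xs = sorted(X)
    # Candidate cuts: below all points, between neighbours, above all points.
    cuts = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    for t in cuts:
        for sign in (1, -1):
            predictions = [1 if sign * (x - t) > 0 else -1 for x in X]
            if predictions == list(y):
                return True
    return False


# Made-up samples: the "outer" points are labelled +1, the "inner" ones -1.
X = [-2.0, -1.0, 1.0, 2.0]
y = [1, -1, -1, 1]

# Hand-picked hyperplane in the mapped space: x^2 = 2.5 (an assumption, not a fitted model).
w, b = (0.0, 1.0), -2.5
margins = [yi * decision(w, b, phi(x)) for x, yi in zip(X, y)]

print("separable in 1-D by a threshold:", separable_by_threshold(X, y))
print("separable after mapping:", all(m > 0 for m in margins))
```

A trained SVM would not use a hand-picked hyperplane; it would search for the maximum-margin one, and it would typically apply the mapping implicitly through a kernel function rather than computing `phi` explicitly.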