Ohio University, Center for Intelligent Chemical Instrumentation, Department of Chemistry and Biochemistry, Clippinger Laboratories, Athens, OH 45701-2979, USA.
Anal Chim Acta. 2017 Feb 15;954:14-21. doi: 10.1016/j.aca.2016.11.072. Epub 2016 Dec 8.
The support vector machine (SVM) is a powerful classifier that has recently been implemented in a classification tree (SVMTreeG). That classifier partitions the data by finding gaps in the data space. For large and complex data sets, however, there may be no gaps in the data space, which confounds this type of classifier. A novel algorithm was devised that uses fuzzy entropy to find optimal partitions when clusters of data overlap in the data space. A kernel version of the fuzzy entropy algorithm was also devised. A fast support vector machine implementation is used that has no cost C or slack variables to optimize. Statistical comparisons among the tree classifiers were made using bootstrapped Latin partitions with a synthetic XOR data set and validated with ten prediction sets composed of 50,000 objects and a data set of NMR spectra obtained from 12 tea sample extracts.
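The idea of choosing a partition by fuzzy entropy when classes overlap can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes a one-dimensional projection of the data, a logistic membership function, and a membership-weighted Shannon entropy. The names `fuzzy_split_entropy`, `best_threshold`, and `softness` are hypothetical and chosen only for illustration.

```python
import math


def fuzzy_split_entropy(values, labels, threshold, softness=1.0):
    """Membership-weighted entropy of splitting 1-D values at a threshold.

    Assumption: each object belongs to the "right" branch with a logistic
    membership, so objects near the threshold count partly on both sides.
    """
    classes = sorted(set(labels))
    # Logistic membership written with tanh so large arguments cannot overflow.
    right = [0.5 * (1.0 + math.tanh((v - threshold) / (2.0 * softness)))
             for v in values]
    left = [1.0 - m for m in right]

    total = 0.0
    for side in (right, left):
        mass = sum(side)  # fuzzy number of objects in this branch
        if mass == 0.0:
            continue
        entropy = 0.0
        for c in classes:
            # Fuzzy proportion of class c in this branch.
            p = sum(m for m, y in zip(side, labels) if y == c) / mass
            if p > 0.0:
                entropy -= p * math.log2(p)
        total += (mass / len(values)) * entropy
    return total


def best_threshold(values, labels, softness=1.0):
    """Return the midpoint between adjacent values with the lowest fuzzy entropy."""
    ordered = sorted(set(values))
    candidates = [(a + b) / 2.0 for a, b in zip(ordered, ordered[1:])]
    return min(candidates,
               key=lambda t: fuzzy_split_entropy(values, labels, t, softness))


# Toy projection: two classes along one axis.
values = [0.0, 1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 10.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold = best_threshold(values, labels)
```

Because the memberships are soft, the criterion stays defined and varies smoothly even when no empty gap separates the classes. A smaller `softness` makes it approach the ordinary crisp entropy of a hard split.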