Gutierrez Pedro Antonio, Hervas-Martinez César, Martinez-Estudillo Francisco J
Department of Computer Science and Numerical Analysis, University of Córdoba, Córdoba 14004, Spain.
IEEE Trans Neural Netw. 2011 Feb;22(2):246-63. doi: 10.1109/TNN.2010.2093537. Epub 2010 Dec 6.
This paper proposes a hybrid multilogistic methodology, named logistic regression using initial and radial basis function (RBF) covariates. The coefficients are obtained in three steps. First, an evolutionary programming (EP) algorithm is applied to produce an RBF neural network (RBFNN) with a reduced number of RBF transformations and the simplest possible structure. Then, the initial attribute space (or, as it is commonly known in the logistic regression literature, the covariate space) is transformed by adding the nonlinear transformations of the input variables given by the RBFs of the best individual in the final generation. Finally, a maximum likelihood optimization method determines the coefficients of a multilogistic regression model built in this augmented covariate space. In this final step, two different multilogistic regression algorithms are applied: one considers all initial and RBF covariates (multilogistic initial-RBF regression), and the other incrementally constructs the model and applies cross validation, resulting in automatic covariate selection [simplelogistic initial-RBF regression (SLIRBF)]. Both methods include a regularization parameter, which has also been optimized. The proposed methodology is tested on 18 well-known machine learning benchmark classification problems and two real agronomical problems. The results are compared with the corresponding multilogistic regression methods applied to the initial covariate space, with the RBFNNs obtained by the EP algorithm (RBFEP), and with other probabilistic classifiers, including different RBFNN design methods [e.g., relaxed variable kernel density estimation, support vector machines, a sparse classifier (sparse multinomial logistic regression)] and a procedure similar to SLIRBF but using product unit basis functions. The SLIRBF models are found to be competitive with the corresponding multilogistic regression methods and with the RBFEP method. A statistical significance analysis indicates that SLIRBF reaches state-of-the-art performance.
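The second and third steps described in the abstract (augmenting the covariate space with RBF outputs, then fitting a regularized multilogistic model by maximum likelihood) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the Gaussian form of the RBF, the hand-picked `centers` and `radii` (in the paper these come from the EP algorithm, which is not reproduced here), the ridge penalty, the plain batch gradient ascent used as a stand-in optimizer, and all function names. It does not implement SLIRBF's incremental covariate selection or cross validation.

```python
import math

def rbf_features(x, centers, radii):
    """Assumed Gaussian RBFs: exp(-||x - c||^2 / (2 r^2)) for each (c, r)."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * r * r))
        for c, r in zip(centers, radii)
    ]

def augment(X, centers, radii):
    """Append the RBF outputs to the initial covariates of every pattern."""
    return [list(x) + rbf_features(x, centers, radii) for x in X]

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_multilogistic(Z, y, n_classes, ridge=1e-3, lr=0.5, epochs=2000):
    """Ridge-penalized maximum likelihood via batch gradient ascent
    (a stand-in for the optimizers used in the paper)."""
    d = len(Z[0]) + 1  # one extra coefficient for the intercept
    n = len(Z)
    W = [[0.0] * d for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * d for _ in range(n_classes)]
        for z, label in zip(Z, y):
            zb = [1.0] + list(z)
            p = softmax([sum(w * v for w, v in zip(Wk, zb)) for Wk in W])
            for k in range(n_classes):
                err = (1.0 if k == label else 0.0) - p[k]
                for j in range(d):
                    grad[k][j] += err * zb[j]
        for k in range(n_classes):
            for j in range(d):
                penalty = ridge * W[k][j] if j > 0 else 0.0  # intercept unpenalized
                W[k][j] += lr * (grad[k][j] / n - penalty)
    return W

def predict(W, z):
    zb = [1.0] + list(z)
    scores = [sum(w * v for w, v in zip(Wk, zb)) for Wk in W]
    return max(range(len(W)), key=lambda k: scores[k])

# Toy XOR-like data: not linearly separable in the initial covariate space.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [0, 0, 1, 1]

# Hypothetical RBF parameters, chosen by hand instead of evolved by EP.
centers = [(0.0, 0.0), (1.0, 1.0)]
radii = [0.5, 0.5]

Z = augment(X, centers, radii)            # initial + RBF covariates
W = fit_multilogistic(Z, y, n_classes=2)  # one coefficient vector per class
preds = [predict(W, z) for z in Z]
print(preds)
```

The toy data are chosen so that the two classes cannot be separated by a model that is linear in the initial covariates alone, while the two added RBF covariates make them separable. This is the motivation the abstract gives for building the model in the augmented covariate space.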