Fang Jianwen, Tai David
Applied Bioinformatics Laboratory, the University of Kansas, Lawrence, 66047, USA.
Curr Drug Discov Technol. 2011 Jun;8(2):107-11. doi: 10.2174/157016311795563839.
Feature selection has become increasingly important in quantitative structure-activity relationship (QSAR) studies. In this article, we evaluate three state-of-the-art feature selection algorithms for reducing the high-dimensional feature space in QSAR regression: mutual information (MI), genetic algorithm (GA), and support vector machine regression (SVR)-based recursive feature elimination (SVR-RFE). We used SVR to evaluate the performance of these feature selection algorithms. In addition, we present a simple but very efficient iterative strategy for optimizing the parameters of the SVM-RFE algorithm. All three algorithms can effectively reduce the number of features and often achieve improved performance.
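As an illustration of the simplest of the three approaches named in the abstract, the following is a minimal sketch of mutual information (MI) based feature ranking. The abstract does not say how MI was estimated, so the equal-width histogram estimator, the bin count, the function names, and the synthetic "descriptor" data below are all assumptions made for illustration and are not the authors' implementation.

```python
import math
import random
from collections import Counter


def discretize(values, n_bins=5):
    """Map continuous values to equal-width bin indices 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x_bins, y_bins):
    """Plug-in MI estimate (in nats) between two discrete sequences."""
    n = len(x_bins)
    px = Counter(x_bins)
    py = Counter(y_bins)
    pxy = Counter(zip(x_bins, y_bins))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    return mi


def rank_features_by_mi(X, y, n_bins=5):
    """Return (feature indices sorted by decreasing MI with y, MI scores)."""
    y_bins = discretize(y, n_bins)
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scores.append(mutual_information(discretize(column, n_bins), y_bins))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Synthetic data (not from the paper): 400 "compounds" with 6 "descriptors";
# only descriptors 0 and 3 influence the simulated "activity".
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(6)] for _ in range(400)]
y = [2.0 * row[0] - 2.0 * row[3] + random.gauss(0, 0.1) for row in X]

order, scores = rank_features_by_mi(X, y)
print("features ranked by MI:", order)
print("MI scores (nats):", [round(s, 3) for s in scores])
```

In a workflow like the one the abstract describes, the top-ranked descriptors would then be passed to an SVR model and scored, for example by cross-validation. A histogram estimator is only one option; it is biased upward for small samples, so the bin count should be kept small relative to the number of compounds.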