Ocloo Isaac Xoese, Chen Hanfeng
Department of Statistics, University of Georgia, Athens, GA 30602, USA.
Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA.
Entropy (Basel). 2022 Dec 21;25(1):14. doi: 10.3390/e25010014.
In this paper, the LASSO method with the extended Bayesian information criterion (EBIC) for feature selection in high-dimensional models is studied. We propose using the energy distance correlation in place of the ordinary correlation coefficient to measure the dependence between two variables. Unlike the ordinary correlation coefficient, which detects only linear association, the energy distance correlation detects both linear and non-linear association. EBIC is adopted as the stopping criterion. Simulation studies show that the new method is more powerful than Luo and Chen's method for feature selection, and a real-life example illustrates its use. It is also proved that the new algorithm is selection-consistent.
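The contrast between the two dependence measures can be sketched numerically. The Python snippet below is a minimal illustration, not the authors' selection algorithm: it assumes that the "energy distance correlation" of the abstract is the standard sample distance correlation computed from double-centered pairwise distance matrices, and the function name `distance_correlation` and the toy data are hypothetical.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation of two 1-D samples of equal length.

    Assumption: this is the usual V-statistic estimator built from
    double-centered pairwise distances; it is an illustrative sketch only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of the same length")

    def centered(v):
        # Pairwise absolute distances, then subtract row and column means
        # and add back the grand mean (double centering).
        d = np.abs(v[:, None] - v[None, :])
        return (d - d.mean(axis=0, keepdims=True)
                - d.mean(axis=1, keepdims=True) + d.mean())

    a, b = centered(x), centered(y)
    dcov2 = (a * b).mean()                              # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())    # product of distance std devs
    if denom == 0.0:
        return 0.0  # one sample is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Toy data (hypothetical): a symmetric grid and a purely quadratic response.
x = np.linspace(-1.0, 1.0, 201)
y_quad = x ** 2

pearson_quad = float(np.corrcoef(x, y_quad)[0, 1])  # linear correlation
dcor_quad = distance_correlation(x, y_quad)         # distance correlation
dcor_lin = distance_correlation(x, 2.0 * x + 1.0)   # exact linear relation

print(f"Pearson (quadratic):  {pearson_quad:.3f}")
print(f"Distance (quadratic): {dcor_quad:.3f}")
print(f"Distance (linear):    {dcor_lin:.3f}")
```

On this toy example the ordinary correlation is essentially zero because the relation is symmetric about zero, while the distance correlation is clearly positive; for an exact linear relation the distance correlation equals one. A screening step that ranks features by such a measure could therefore retain a feature that a Pearson-based ranking would discard.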