School of Science, Xi'an University of Architecture and Technology, Xi'an, Shaanxi 710055, China.
School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China.
Comput Intell Neurosci. 2017;2017:2747431. doi: 10.1155/2017/2747431. Epub 2017 Nov 15.
As a pivotal tool for building interpretable models, variable selection plays an increasingly important role in high-dimensional data analysis. In recent years, variable selection ensembles (VSEs) have attracted much interest owing to their many advantages. Stability selection (Meinshausen and Bühlmann, 2010), a VSE technique that combines subsampling with a base algorithm such as lasso, is an effective method for controlling the false discovery rate (FDR) and improving selection accuracy in linear regression models. By adopting lasso as the base learner, we attempt to extend stability selection to variable selection problems in the Cox model. In our experience, stability selection works well only when the regularization region Λ of lasso and the associated parameter are set properly. To the best of our knowledge, however, no existing literature addresses this problem explicitly. Therefore, we first provide a detailed procedure for specifying Λ and this parameter. Then, simulated and real-world data with various censoring rates are used to examine how well stability selection performs, and it is compared with several other variable selection approaches. Experimental results demonstrate that it achieves better or competitive performance relative to these popular techniques.
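The core stability selection procedure described above (repeated subsampling, fitting a lasso path over Λ, and keeping variables whose selection frequency exceeds a threshold) can be sketched as follows. This is a minimal NumPy illustration for the linear-regression setting only, not the Cox extension studied in the paper; the hand-rolled coordinate-descent lasso, the λ grid, the subsample count, and the 0.7 threshold are all illustrative choices, not values from the paper.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for lasso: (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(p):
            # partial residual excluding coordinate j
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r / n
            # soft-thresholding update
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j] if col_sq[j] > 0 else 0.0
    return b

def stability_selection(X, y, lambdas, n_subsamples=50, threshold=0.7, rng=None):
    """Return per-variable selection frequencies and the stable set.

    A variable counts as 'selected' on a subsample if lasso gives it a
    nonzero coefficient for any lambda in the regularization region.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    freq = np.zeros(p)
    for _ in range(n_subsamples):
        # draw a subsample of size floor(n/2) without replacement
        idx = rng.choice(n, size=n // 2, replace=False)
        selected = np.zeros(p, dtype=bool)
        for lam in lambdas:
            beta = lasso_cd(X[idx], y[idx], lam)
            selected |= np.abs(beta) > 1e-8
        freq += selected
    probs = freq / n_subsamples
    return probs, np.where(probs >= threshold)[0]
```

Usage on synthetic data: generate y from a few strong predictors plus noise, then check that the stable set recovers the true support while the noise variables stay below the frequency threshold. The choice of Λ matters exactly as the abstract suggests: if the grid includes λ values that are too small, noise variables enter the lasso path often enough to inflate their selection frequencies.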