VARIABLE SELECTION AND ESTIMATION IN HIGH-DIMENSIONAL VARYING-COEFFICIENT MODELS.

Authors

Wei Fengrong, Huang Jian, Li Hongzhe

Affiliation

Department of Mathematics, University of West Georgia, 1601 Maple Street, Carrollton, GA 30118, USA.

Publication

Stat Sin. 2011 Oct 1;21(4):1515-1540. doi: 10.5705/ss.2009.316.

Abstract

Nonparametric varying coefficient models are useful for studying the time-dependent effects of variables. Many procedures have been developed for estimation and variable selection in such models. However, existing work has focused on the case when the number of variables is fixed or smaller than the sample size. In this paper, we consider the problem of variable selection and estimation in varying coefficient models in sparse, high-dimensional settings when the number of variables can be larger than the sample size. We apply the group Lasso and basis function expansion to simultaneously select the important variables and estimate the nonzero varying coefficient functions. Under appropriate conditions, we show that the group Lasso selects a model of the right order of dimensionality, selects all variables with the norms of the corresponding coefficient functions greater than a certain threshold level, and is estimation consistent. However, the group Lasso is in general not selection consistent and tends to select variables that are not important in the model. In order to improve the selection results, we apply the adaptive group Lasso. We show that, under suitable conditions, the adaptive group Lasso has the oracle selection property in the sense that it correctly selects important variables with probability converging to one. In contrast, the group Lasso does not possess such an oracle property. Both approaches are evaluated using simulation and demonstrated on a data example.
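
The two-stage idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a hypothetical polynomial basis (1, t, t^2) in place of whatever basis the paper uses, a plain proximal-gradient solver, and illustrative names (`build_design`, `fit_group_lasso`, `adaptive_weights`) and tuning values chosen only for this example. Each variable's basis coefficients form one group, so a whole coefficient function is either kept or set to zero; the adaptive stage reweights the penalty by the inverse group norms from the first fit.

```python
import math
import random


def basis(t):
    # Assumed basis for this sketch only: polynomial terms 1, t, t^2.
    return [1.0, t, t * t]


def build_design(x, t):
    """Row i holds x[i][k] * B_j(t[i]); columns of variable k are contiguous."""
    return [[xik * b for xik in xi for b in basis(ti)] for xi, ti in zip(x, t)]


def group_soft_threshold(v, thresh):
    """Shrink the block v toward zero; zero it entirely if its norm <= thresh."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= thresh:
        return [0.0] * len(v)
    scale = 1.0 - thresh / norm
    return [scale * c for c in v]


def group_norms(gamma, m):
    """Euclidean norm of each consecutive block of m coefficients."""
    return [math.sqrt(sum(c * c for c in gamma[k:k + m]))
            for k in range(0, len(gamma), m)]


def fit_group_lasso(u, y, m, lam, weights=None, iters=300):
    """Minimise (1/2n)||y - U g||^2 + lam * sum_k w_k ||g_k|| by proximal gradient.

    Use lam > 0 when any weight is infinite.
    """
    n, p = len(u), len(u[0])
    groups = p // m
    weights = weights or [1.0] * groups
    # Conservative step size: the squared Frobenius norm bounds the largest
    # eigenvalue of U'U, so this step never exceeds 1 / Lipschitz constant.
    step = n / sum(c * c for row in u for c in row)
    gamma = [0.0] * p
    for _ in range(iters):
        resid = [yi - sum(a * g for a, g in zip(row, gamma))
                 for row, yi in zip(u, y)]
        grad = [-sum(row[j] * r for row, r in zip(u, resid)) / n
                for j in range(p)]
        z = [g - step * d for g, d in zip(gamma, grad)]
        gamma = []
        for k in range(groups):
            block = z[k * m:(k + 1) * m]
            gamma += group_soft_threshold(block, step * lam * weights[k])
    return gamma


def adaptive_weights(gamma_init, m):
    """Inverse group norms of a first-stage fit; dropped groups stay dropped."""
    return [1.0 / nrm if nrm > 0 else math.inf
            for nrm in group_norms(gamma_init, m)]


if __name__ == "__main__":
    random.seed(0)
    n, groups, m = 80, 6, 3
    t = [random.random() for _ in range(n)]
    x = [[random.gauss(0, 1) for _ in range(groups)] for _ in range(n)]
    # Assumed truth: only the first two variables have nonzero coefficient functions.
    y = [2 * ti * xi[0] + math.sin(2 * math.pi * ti) * xi[1] + random.gauss(0, 0.5)
         for xi, ti in zip(x, t)]
    u = build_design(x, t)
    first = fit_group_lasso(u, y, m, lam=0.05)
    second = fit_group_lasso(u, y, m, lam=0.05, weights=adaptive_weights(first, m))
    print("group Lasso norms:         ", [round(v, 3) for v in group_norms(first, m)])
    print("adaptive group Lasso norms:", [round(v, 3) for v in group_norms(second, m)])
```

A variable counts as selected when its group norm is nonzero. The penalty level `lam=0.05` is arbitrary here; in practice it would be chosen by a data-driven criterion.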

Similar articles

VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS.
Ann Stat. 2010 Aug 1;38(4):2282-2313. doi: 10.1214/09-AOS781.

Consistent group selection in high-dimensional linear regression.
Bernoulli (Andover). 2010 Nov;16(4):1369-1384. doi: 10.3150/10-BEJ252.

The lasso for high dimensional regression with a possible change point.
J R Stat Soc Series B Stat Methodol. 2016 Jan;78(1):193-210. doi: 10.1111/rssb.12108. Epub 2015 Feb 15.

Cited by

Time-varying coefficient model estimation through radial basis functions.
J Appl Stat. 2021 Apr 5;49(10):2510-2534. doi: 10.1080/02664763.2021.1910938. eCollection 2022.

Feature screening in ultrahigh-dimensional varying-coefficient Cox model.
J Multivar Anal. 2019 May;171:284-297. doi: 10.1016/j.jmva.2018.12.009. Epub 2018 Dec 28.

Feature screening in ultrahigh-dimensional additive Cox model.
J Stat Comput Simul. 2018;88(6):1117-1133. doi: 10.1080/00949655.2017.1422127. Epub 2018 Jan 8.
