Model-Free Forward Screening via Cumulative Divergence

Authors

Zhou Tingyou, Zhu Liping, Xu Chen, Li Runze

Affiliations

School of Data Sciences, Zhejiang University of Finance and Economics, Hangzhou, P. R. China.

Institute of Statistics and Big Data and Center for Applied Statistics, Renmin University of China, Beijing, P. R. China.

Publication information

J Am Stat Assoc. 2020;115(531):1393-1405. doi: 10.1080/01621459.2019.1632078. Epub 2019 Jul 22.

DOI:10.1080/01621459.2019.1632078
PMID:33487782
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7821979/
Abstract

Feature screening plays an important role in the analysis of ultrahigh dimensional data. Due to complicated model structures and high noise levels, existing screening methods often suffer from model misspecification and the presence of outliers. To address these issues, we introduce a new metric named cumulative divergence (CD), and develop a CD-based forward screening procedure. This forward screening method is model-free and resistant to the presence of outliers in the response. It also incorporates the joint effects among covariates into the screening process. With a data-driven threshold, the new method can automatically determine the number of features that should be retained after screening. These merits make CD-based screening very appealing in practice. Under certain regularity conditions, we show that the proposed method possesses the sure screening property. The performance of our proposal is illustrated through simulations and a real data example.
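
The abstract does not give the formula for cumulative divergence, so the following is only a rough sketch of the general idea of forward screening with a rank-based utility, not the authors' procedure. It swaps in a stand-in utility (the average squared covariance between a covariate and the indicators 1(Y <= y_j)), which depends on the response only through its ordering and is therefore insensitive to outliers in Y. The names `indicator_cov_utility` and `forward_screen`, the least-squares residualization used to account for already-selected covariates, and the user-supplied `threshold` (in place of the paper's data-driven threshold) are all assumptions made for illustration.

```python
import numpy as np


def indicator_cov_utility(x, y):
    """Stand-in utility (NOT the paper's CD): mean over cut points y_j of
    the squared sample covariance between x and the indicator 1(y <= y_j)."""
    n = len(y)
    xc = x - x.mean()
    # ind[i, j] = 1 if y_i <= y_j; depends on y only through its ordering
    ind = (y[:, None] <= y[None, :]).astype(float)
    ind -= ind.mean(axis=0, keepdims=True)
    cov = xc @ ind / n  # one covariance per cut point y_j
    return float(np.mean(cov ** 2))


def forward_screen(X, y, max_steps=5, threshold=None):
    """Greedy forward screening sketch.

    At each step, score every unselected covariate after removing the linear
    effect of the covariates selected so far (an illustrative way to reflect
    joint effects), then add the best-scoring one. Stops after `max_steps`
    or when the best score drops below the hypothetical `threshold`.
    """
    n, p = X.shape
    # Standardize columns once so the scores are comparable across covariates
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    resid = Xs.copy()
    selected = []
    for _ in range(min(max_steps, p)):
        scores = np.array([
            -np.inf if k in selected else indicator_cov_utility(resid[:, k], y)
            for k in range(p)
        ])
        best = int(np.argmax(scores))
        if threshold is not None and scores[best] < threshold:
            break
        selected.append(best)
        # Residualize all covariates on the selected set (least squares)
        Z = Xs[:, selected]
        beta, *_ = np.linalg.lstsq(Z, Xs, rcond=None)
        resid = Xs - Z @ beta
    return selected


# Small synthetic illustration (data are simulated, not from the paper)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 30))
y_demo = 3.0 * X_demo[:, 3] + 0.5 * rng.normal(size=200)
selected_demo = forward_screen(X_demo, y_demo, max_steps=3)
```

Because the stand-in utility uses only the indicators 1(Y <= y_j), applying any strictly increasing transformation to the response leaves the scores unchanged, which is one simple way a screening utility can be made robust to extreme response values.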
