A robust variable screening procedure for ultra-high dimensional data.

Authors

Ghosh Abhik, Thoresen Magne

Affiliations

Interdisciplinary Statistical Research Unit, Indian Statistical Institute, Kolkata, India.

Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway.

Publication information

Stat Methods Med Res. 2021 Aug;30(8):1816-1832. doi: 10.1177/09622802211017299. Epub 2021 May 30.

DOI:10.1177/09622802211017299
PMID:34053339
Abstract

Variable selection in ultra-high dimensional regression problems has become an important issue. In such situations, penalized regression models may face computational problems and some pre-screening of the variables may be necessary. A number of procedures for such pre-screening have been developed; among them the Sure Independence Screening (SIS) enjoys some popularity. However, SIS is vulnerable to outliers in the data, and in particular in small samples this may lead to faulty inference. In this paper, we develop a new robust screening procedure. We build on the density power divergence (DPD) estimation approach and introduce DPD-SIS and its extension iterative DPD-SIS. We illustrate the behavior of the methods through extensive simulation studies and show that they are superior to both the original SIS and other robust methods when there are outliers in the data. Finally, we illustrate its use in a study on regulation of lipid metabolism.
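
The contrast between classical SIS and a DPD-based screen can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names `sis_rank`, `dpd_marginal_slope`, and `dpd_sis_rank`, the tuning value `alpha=0.3`, the fixed-point (reweighting) iteration, and the choice to rank covariates by the absolute marginal slope are all assumptions made here for illustration. It assumes a linear model with normal errors, for which the DPD estimating equations downweight each observation by `exp(-alpha * r**2 / (2 * sigma**2))`.

```python
import numpy as np


def sis_rank(X, y, d):
    """Classical SIS: keep the d covariates with largest |marginal correlation|."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    score = np.abs(Xs.T @ ys) / len(y)
    return np.argsort(-score)[:d]


def dpd_marginal_slope(x, y, alpha=0.3, n_iter=50):
    """Slope of a one-covariate linear fit via a DPD-type fixed-point iteration.

    Assumption: normal errors; alpha > 0 trades efficiency for robustness.
    """
    A = np.column_stack([np.ones_like(x), x])
    beta = np.array([np.median(y), 0.0])            # robust starting value
    r = y - A @ beta
    sigma = 1.4826 * np.median(np.abs(r - np.median(r)))
    c = alpha / (1.0 + alpha) ** 1.5                # constant in the scale equation
    for _ in range(n_iter):
        # Observations with large residuals get weights close to zero.
        w = np.exp(-alpha * r**2 / (2.0 * sigma**2))
        beta = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (w * y))
        r = y - A @ beta
        denom = max(np.mean(w) - c, 1e-8)           # guard against a non-positive scale
        sigma = np.sqrt(np.mean(w * r**2) / denom)
    return beta[1]


def dpd_sis_rank(X, y, d, alpha=0.3):
    """Rank standardized covariates by |robust marginal slope| and keep the top d."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    slopes = np.array([dpd_marginal_slope(Xs[:, j], y, alpha) for j in range(X.shape[1])])
    return np.argsort(-np.abs(slopes))[:d]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))
    y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)  # only covariate 0 is active
    y_out = y.copy()
    y_out[:10] = 100.0                              # 5% gross outliers in the response
    print("SIS on clean data:       ", sis_rank(X, y, 5))
    print("SIS on contaminated data:", sis_rank(X, y_out, 5))
    print("DPD screen, contaminated:", dpd_sis_rank(X, y_out, 5))
```

In this toy setup the gross outliers receive weights that are effectively zero in the DPD-type fit, so the marginal slope of the active covariate is estimated from the clean observations. Setting `alpha` close to zero moves the fit toward ordinary least squares and removes that protection.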

Similar articles

1
A robust variable screening procedure for ultra-high dimensional data.
Stat Methods Med Res. 2021 Aug;30(8):1816-1832. doi: 10.1177/09622802211017299. Epub 2021 May 30.
2
Combined Performance of Screening and Variable Selection Methods in Ultra-High Dimensional Data in Predicting Time-To-Event Outcomes.
Diagn Progn Res. 2018;2. doi: 10.1186/s41512-018-0043-4. Epub 2018 Sep 26.
3
Feature Screening via Distance Correlation Learning.
J Am Stat Assoc. 2012 Jul 1;107(499):1129-1139. doi: 10.1080/01621459.2012.695654.
4
A Generic Sure Independence Screening Procedure.
J Am Stat Assoc. 2019;114(526):928-937. doi: 10.1080/01621459.2018.1462709. Epub 2018 Aug 6.
5
The Sparse MLE for Ultra-High-Dimensional Feature Screening.
J Am Stat Assoc. 2014;109(507):1257-1269. doi: 10.1080/01621459.2013.879531.
6
Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models.
Entropy (Basel). 2023 May 26;25(6):851. doi: 10.3390/e25060851.
7
Ultra-high dimensional variable selection for doubly robust causal inference.
Biometrics. 2023 Jun;79(2):903-914. doi: 10.1111/biom.13625. Epub 2022 Mar 22.
8
Prediction of Treatment Outcome for Autism from Structure of the Brain Based on Sure Independence Screening.
Proc IEEE Int Symp Biomed Imaging. 2019 Apr;2019:404-408. doi: 10.1109/ISBI.2019.8759156. Epub 2019 Jul 11.
9
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models.
J Am Stat Assoc. 2011 Jun;106(494):544-557. doi: 10.1198/jasa.2011.tm09779.
10
Model-Free Feature Screening for Ultrahigh Dimensional Discriminant Analysis.
J Am Stat Assoc. 2015 Jun 1;110(510):630-641. doi: 10.1080/01621459.2014.920256.

Cited by

1
Identifying and overcoming COVID-19 vaccination impediments using Bayesian data mining techniques.
Sci Rep. 2024 Apr 13;14(1):8595. doi: 10.1038/s41598-024-58902-1.
2
Design of feature selection algorithm for high-dimensional network data based on supervised discriminant projection.
PeerJ Comput Sci. 2023 Jun 26;9:e1447. doi: 10.7717/peerj-cs.1447. eCollection 2023.