
Similar articles

1
Ultrahigh dimensional feature selection: beyond the linear model.
J Mach Learn Res. 2009;10:2013-2038.
2
Feature Screening via Distance Correlation Learning.
J Am Stat Assoc. 2012 Jul 1;107(499):1129-1139. doi: 10.1080/01621459.2012.695654.
3
Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models.
Entropy (Basel). 2023 May 26;25(6):851. doi: 10.3390/e25060851.
4
A selective overview of feature screening for ultrahigh-dimensional data.
Sci China Math. 2015 Oct;58(10):2033-2054. doi: 10.1007/s11425-015-5062-9. Epub 2015 Aug 22.
5
Feature Screening in Ultrahigh Dimensional Cox's Model.
Stat Sin. 2016;26:881-901. doi: 10.5705/ss.2014.171.
6
Covariate Information Number for Feature Screening in Ultrahigh-Dimensional Supervised Problems.
J Am Stat Assoc. 2022;117(539):1516-1529. doi: 10.1080/01621459.2020.1864380. Epub 2021 Feb 10.
7
Feature screening for case-cohort studies with failure time outcome.
Scand Stat Theory Appl. 2021 Mar;48(1):349-370. doi: 10.1111/sjos.12503. Epub 2020 Nov 16.
8
Model-Free Conditional Independence Feature Screening For Ultrahigh Dimensional Data.
Sci China Math. 2017 Mar;60(3):551-568. doi: 10.1007/s11425-016-0186-8. Epub 2016 Dec 29.
9
Feature Selection for Varying Coefficient Models With Ultrahigh Dimensional Covariates.
J Am Stat Assoc. 2014 Jan 1;109(505):266-274. doi: 10.1080/01621459.2013.850086.
10
Feature Screening in Ultrahigh Dimensional Generalized Varying-coefficient Models.
Stat Sin. 2020;30:1049-1067. doi: 10.5705/ss.202017.0362.

Cited by

1
-KIDS: A Novel Feature Evaluation in the Ultrahigh-Dimensional Right-Censored Setting, With Application to Head and Neck Cancer.
Stat Med. 2025 Jul;44(15-17):e70167. doi: 10.1002/sim.70167.
2
Sparse vertex discriminant analysis: Variable selection for biomedical classification applications.
Comput Stat Data Anal. 2025 Jun;206. doi: 10.1016/j.csda.2025.108125. Epub 2025 Jan 7.
3
Early Diagnosis of Bloodstream Infections Using Serum Metabolomic Analysis.
Metabolites. 2024 Dec 6;14(12):685. doi: 10.3390/metabo14120685.
4
-KIDS: A novel feature evaluation in the ultrahigh-dimensional right-censored setting, with application to Head and Neck Cancer.
medRxiv. 2024 Aug 14:2024.08.13.24311946. doi: 10.1101/2024.08.13.24311946.
5
A Model-free Approach for Testing Association.
J R Stat Soc Ser C Appl Stat. 2021 Jun;70(3):511-531. doi: 10.1111/rssc.12467. Epub 2021 Jun 4.
6
Omics feature selection with the extended SIS R package: identification of a body mass index epigenetic multimarker in the Strong Heart Study.
Am J Epidemiol. 2024 Jul 8;193(7):1010-1018. doi: 10.1093/aje/kwae006.
7
Machine-learning methods based on the texture and non-texture features of MRI for the preoperative prediction of sentinel lymph node metastasis in breast cancer.
Transl Cancer Res. 2023 Dec 31;12(12):3471-3485. doi: 10.21037/tcr-22-2534. Epub 2023 Dec 6.
8
Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models.
Entropy (Basel). 2023 May 26;25(6):851. doi: 10.3390/e25060851.
9
A Model-free Variable Screening Method Based on Leverage Score.
J Am Stat Assoc. 2023;118(541):135-146. doi: 10.1080/01621459.2021.1918554. Epub 2021 Jun 21.
10
BatMan: Mitigating Batch Effects Via Stratification for Survival Outcome Prediction.
JCO Clin Cancer Inform. 2023 Jun;7:e2200138. doi: 10.1200/CCI.22.00138.

References

1
One-step Sparse Estimates in Nonconcave Penalized Likelihood Models.
Ann Stat. 2008 Aug 1;36(4):1509-1533. doi: 10.1214/009053607000000802.
2
Discussion of "Sure Independence Screening for Ultra-High Dimensional Feature Space."
J R Stat Soc Series B Stat Methodol. 2008 Nov;70(5):903. doi: 10.1111/j.1467-9868.2008.00674.x.
3
High Dimensional Classification Using Features Annealed Independence Rules.
Ann Stat. 2008;36(6):2605-2637. doi: 10.1214/07-AOS504.
4
Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.
J Clin Oncol. 2006 Nov 1;24(31):5070-8. doi: 10.1200/JCO.2006.06.1879.
5
Statistical analysis of DNA microarray data in cancer research.
Clin Cancer Res. 2006 Aug 1;12(15):4469-73. doi: 10.1158/1078-0432.CCR-06-1033.
6
Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization.
Proc Natl Acad Sci U S A. 2003 Mar 4;100(5):2197-202. doi: 10.1073/pnas.0437847100. Epub 2003 Feb 21.
7
Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks.
Nat Med. 2001 Jun;7(6):673-9. doi: 10.1038/89044.

Ultrahigh dimensional feature selection: beyond the linear model.

Authors

Fan Jianqing, Samworth Richard, Wu Yichao

Affiliation

Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08540 USA.

Publication

J Mach Learn Res. 2009;10:2013-2038.

PMID: 21603590
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3095976/
Abstract

Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Many frequently-used techniques are based on independence screening; examples include correlation ranking (Fan and Lv, 2008) or feature selection using a two-sample t-test in high-dimensional classification (Tibshirani et al., 2003). Within the context of the linear model, Fan and Lv (2008) showed that this simple correlation ranking possesses a sure independence screening property under certain conditions and that its revision, called iteratively sure independent screening (ISIS), is needed when the features are marginally unrelated but jointly related to the response variable. In this paper, we extend ISIS, without explicit definition of residuals, to a general pseudo-likelihood framework, which includes generalized linear models as a special case. Even in the least-squares setting, the new method improves ISIS by allowing feature deletion in the iterative process. Our technique allows us to select important features in high-dimensional classification where the popularly used two-sample t-method fails. A new technique is introduced to reduce the false selection rate in the feature screening stage. Several simulated and two real data examples are presented to illustrate the methodology.
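
As a rough illustration of the correlation ranking that the abstract attributes to Fan and Lv (2008), the sketch below ranks features by absolute marginal correlation with the response and keeps the top d. It is not the paper's pseudo-likelihood ISIS procedure. The function name, the simulated data, and the default cutoff d = floor(n / log n) are assumptions made for this example only.

```python
import numpy as np

def sis_correlation_ranking(X, y, d=None):
    """Rank features by absolute marginal correlation with y; keep the top d.

    Hypothetical helper for illustration; the default d = floor(n / log n)
    is an assumed cutoff, not something stated in the abstract.
    """
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))
    # Standardize columns and response so the inner product is a correlation.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    omega = np.abs(Xs.T @ ys) / n        # |marginal correlation| per feature
    order = np.argsort(-omega)           # indices sorted by decreasing utility
    return order[:d], omega

# Simulated data (assumed setup): n = 200 samples, p = 1000 features,
# only the first three features carry signal.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + 3.0 * X[:, 2] + rng.standard_normal(n)

selected, omega = sis_correlation_ranking(X, y)
print(len(selected), sorted(selected.tolist())[:5])
```

Marginal ranking of this kind can miss a feature that is marginally uncorrelated with the response yet jointly important, which is the situation the abstract gives as the motivation for the iterative variant (ISIS).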
