Feature Screening via Distance Correlation Learning.

Authors

Li Runze, Zhong Wei, Zhu Liping

Affiliations

The Pennsylvania State University, Xiamen University & Shanghai University of Finance and Economics.

Publication details

J Am Stat Assoc. 2012 Jul 1;107(499):1129-1139. doi: 10.1080/01621459.2012.695654.

DOI: 10.1080/01621459.2012.695654
PMID: 25249709
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4170057/
Abstract

This paper is concerned with screening features in ultrahigh dimensional data analysis, which has become increasingly important in diverse scientific fields. We develop a sure independence screening procedure based on the distance correlation (DC-SIS, for short). The DC-SIS can be implemented as easily as the sure independence screening procedure based on the Pearson correlation (SIS, for short) proposed by Fan and Lv (2008). However, the DC-SIS can significantly improve the SIS. Fan and Lv (2008) established the sure screening property for the SIS based on linear models, but the sure screening property is valid for the DC-SIS under more general settings including linear models. Furthermore, the implementation of the DC-SIS does not require model specification (e.g., linear model or generalized linear model) for responses or predictors. This is a very appealing property in ultrahigh dimensional data analysis. Moreover, the DC-SIS can be used directly to screen grouped predictor variables and for multivariate response variables. We establish the sure screening property for the DC-SIS, and conduct simulations to examine its finite sample performance. Numerical comparison indicates that the DC-SIS performs much better than the SIS in various models. We also illustrate the DC-SIS through a real data example.
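The screening idea summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it computes the sample distance correlation between each predictor and the response, ranks predictors by that score, and keeps the top `d`. The function names (`distance_correlation`, `dc_sis`), the simulated data, and the cutoff `d = int(n / log n)` are assumptions made for the example. The abstract itself does not specify a cutoff.

```python
import numpy as np

def _centered_dist(a):
    """Double-centered matrix of pairwise Euclidean distances between rows of a."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a vector as n observations of one variable
    d = np.sqrt(((a[:, None, :] - a[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation between x and y (each may be multivariate)."""
    A, B = _centered_dist(x), _centered_dist(y)
    dcov2_xy = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom <= 0:
        return 0.0  # a constant variable carries no dependence information
    return float(np.sqrt(max(dcov2_xy, 0.0) / denom))

def dc_sis(X, y, d):
    """Rank columns of X by distance correlation with y and keep the top d."""
    scores = np.array([distance_correlation(X[:, k], y) for k in range(X.shape[1])])
    order = np.argsort(-scores)  # descending by score
    return order[:d], scores

# Simulated example (assumed, not from the paper): feature 0 acts through a
# square, so its Pearson correlation with y is near zero, while feature 1
# acts linearly.
rng = np.random.default_rng(0)
n, p = 200, 300
X = rng.standard_normal((n, p))
y = 2 * X[:, 0] ** 2 + 2 * X[:, 1] + 0.5 * rng.standard_normal(n)

d = int(n / np.log(n))  # assumed cutoff for illustration
selected, scores = dc_sis(X, y, d)

# Marginal Pearson screening on the same data, for contrast.
pearson = np.array([abs(np.corrcoef(X[:, k], y)[0, 1]) for k in range(p)])
print("DC rank of feature 0:", int(np.argsort(-scores).tolist().index(0)) + 1)
print("Pearson rank of feature 0:", int(np.argsort(-pearson).tolist().index(0)) + 1)
```

Because `_centered_dist` accepts matrices as well as vectors, the same `distance_correlation` function can score a group of predictor columns or a multivariate response, which matches the grouped-predictor and multivariate-response use mentioned in the abstract.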

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee44/4170057/b7f563ce7c8e/nihms382822f1.jpg

Similar articles

1
Feature Screening via Distance Correlation Learning.
J Am Stat Assoc. 2012 Jul 1;107(499):1129-1139. doi: 10.1080/01621459.2012.695654.
2
Feature Screening in Ultrahigh Dimensional Cox's Model.
Stat Sin. 2016;26:881-901. doi: 10.5705/ss.2014.171.
3
Model-Free Conditional Independence Feature Screening For Ultrahigh Dimensional Data.
Sci China Math. 2017 Mar;60(3):551-568. doi: 10.1007/s11425-016-0186-8. Epub 2016 Dec 29.
4
Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models.
Entropy (Basel). 2023 May 26;25(6):851. doi: 10.3390/e25060851.
5
Model-Free Feature Screening for Ultrahigh Dimensional Discriminant Analysis.
J Am Stat Assoc. 2015 Jun 1;110(510):630-641. doi: 10.1080/01621459.2014.920256.
6
Censored Rank Independence Screening for High-dimensional Survival Data.
Biometrika. 2014;101(4):799-814. doi: 10.1093/biomet/asu047.
7
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models.
J Am Stat Assoc. 2011 Jun;106(494):544-557. doi: 10.1198/jasa.2011.tm09779.
8
Feature Selection for Varying Coefficient Models With Ultrahigh Dimensional Covariates.
J Am Stat Assoc. 2014 Jan 1;109(505):266-274. doi: 10.1080/01621459.2013.850086.
9
Ultrahigh dimensional feature selection: beyond the linear model.
J Mach Learn Res. 2009;10:2013-2038.
10
Feature Screening in Ultrahigh Dimensional Generalized Varying-coefficient Models.
Stat Sin. 2020;30:1049-1067. doi: 10.5705/ss.202017.0362.

Cited by

1
Detection of LUAD-Associated Genes Using Wasserstein Distance in Multiomics Feature Selection.
Bioengineering (Basel). 2025 Jun 25;12(7):694. doi: 10.3390/bioengineering12070694.
2
-KIDS: A Novel Feature Evaluation in the Ultrahigh-Dimensional Right-Censored Setting, With Application to Head and Neck Cancer.
Stat Med. 2025 Jul;44(15-17):e70167. doi: 10.1002/sim.70167.
3
A Multi-Omics Framework for Survival Mediation Analysis of High-Dimensional Proteogenomic Data.
ArXiv. 2025 Mar 11:arXiv:2503.08606v1.
4
Uncertainty Quantification in Epigenetic Clocks via Conformalized Quantile Regression.
Genet Epidemiol. 2025 Jun;49(4):e70008. doi: 10.1002/gepi.70008.
5
Optimizing Quality Tolerance Limits Monitoring in Clinical Trials Through Machine Learning Methods.
Ther Innov Regul Sci. 2025 May;59(3):566-578. doi: 10.1007/s43441-025-00754-6. Epub 2025 Feb 25.
6
Universally Consistent K-Sample Tests via Dependence Measures.
Stat Probab Lett. 2025 Jan;216. doi: 10.1016/j.spl.2024.110278. Epub 2024 Sep 19.
7
Uncertainty quantification in epigenetic clocks via conformalized quantile regression.
medRxiv. 2025 Feb 11:2024.09.06.24313192. doi: 10.1101/2024.09.06.24313192.
8
-KIDS: A novel feature evaluation in the ultrahigh-dimensional right-censored setting, with application to Head and Neck Cancer.
medRxiv. 2024 Aug 14:2024.08.13.24311946. doi: 10.1101/2024.08.13.24311946.
9
A model-free and distribution-free multi-omics integration approach for detecting novel lung adenocarcinoma genes.
Sci Rep. 2024 Aug 3;14(1):17996. doi: 10.1038/s41598-023-45813-w.
10
A Model-free Approach for Testing Association.
J R Stat Soc Ser C Appl Stat. 2021 Jun;70(3):511-531. doi: 10.1111/rssc.12467. Epub 2021 Jun 4.

References

1
A regularized Hotelling's test for pathway analysis in proteomic studies.
J Am Stat Assoc. 2011 Dec;106(496):1345-1360. doi: 10.1198/jasa.2011.ap10599.
2
Model-Free Feature Screening for Ultrahigh Dimensional Data.
J Am Stat Assoc. 2011 Jan 1;106(496):1464-1475. doi: 10.1198/jasa.2011.tm10563. Epub 2012 Jan 24.
3
Principled sure independence screening for Cox models with ultra-high-dimensional covariates.
J Multivar Anal. 2012 Feb 1;105(1):397-411. doi: 10.1016/j.jmva.2011.08.002.
4
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models.
J Am Stat Assoc. 2011 Jun;106(494):544-557. doi: 10.1198/jasa.2011.tm09779.
5
Ultrahigh dimensional feature selection: beyond the linear model.
J Mach Learn Res. 2009;10:2013-2038.
6
On Brownian Distance Covariance and High Dimensional Data.
Ann Appl Stat. 2009 Jan 1;3(4):1266-1269. doi: 10.1214/09-AOAS312.
7
On the Adaptive Elastic-Net with a Diverging Number of Parameters.
Ann Stat. 2009;37(4):1733-1751. doi: 10.1214/08-AOS625.
8
One-step Sparse Estimates in Nonconcave Penalized Likelihood Models.
Ann Stat. 2008 Aug 1;36(4):1509-1533. doi: 10.1214/009053607000000802.
9
Discussion of "Sure Independence Screening for Ultra-High Dimensional Feature Space".
J R Stat Soc Series B Stat Methodol. 2008 Nov;70(5):903. doi: 10.1111/j.1467-9868.2008.00674.x.
10
Core signaling pathways in human pancreatic cancers revealed by global genomic analyses.
Science. 2008 Sep 26;321(5897):1801-6. doi: 10.1126/science.1164368. Epub 2008 Sep 4.