
A screening method for ultra-high dimensional features with overlapped partition structures.

Affiliations

Department of Biostatistics, School of Public Health, Peking University Health Science Center, Beijing, China.

Beijing International Center for Mathematical Research, Peking University, Beijing, China.

Publication information

Stat Methods Med Res. 2023 Jan;32(1):22-40. doi: 10.1177/09622802221129043. Epub 2022 Sep 29.

DOI:10.1177/09622802221129043
PMID:36177601
Abstract

Ultra-high dimensional data, such as gene and neuroimaging data, are becoming increasingly important in biomedical science. Identifying important biomarkers among a huge number of features can offer better insight for further research. Variable screening is an efficient tool for this goal in large-scale settings: it reduces the feature dimension to a moderate size by removing most of the inactive features. Developing novel variable screening methods for high-dimensional features with group structures is challenging, especially when the groups overlap. For example, a very large set of genes can usually be partitioned into hundreds of pathways according to background knowledge. A primary characteristic of this type of data is that many genes appear in more than one pathway, so different pathways overlap. However, existing variable screening methods can only handle disjoint group structures. To fill this gap, we propose a novel variable screening method for the generalized linear model that incorporates overlapped partition structures, with theoretical guarantees. Besides establishing the sure screening property, we also evaluate the performance of the proposed method through a series of numerical studies and apply it to the statistical analysis of a breast cancer data set.
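To make the idea of screening with overlapping groups concrete, here is a minimal sketch in Python. It is not the procedure proposed in the paper, whose details the abstract does not give. It assumes a linear model instead of a general GLM, uses absolute marginal correlation as the feature score, and combines each feature's score with the mean score of the strongest group it belongs to. The function names (`marginal_scores`, `group_aware_scores`, `screen`) and the averaging rule are hypothetical choices made for illustration only.

```python
import numpy as np

def marginal_scores(X, y):
    """Absolute marginal correlation of each column of X with y.

    Assumes no column of X is constant (otherwise the denominator is zero).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

def group_aware_scores(scores, groups):
    """Hypothetical rule: average a feature's own score with the mean score
    of the strongest group containing it. Groups may overlap; features that
    belong to no group keep their own score."""
    best = np.full(scores.shape, -np.inf)
    for members in groups:
        idx = np.asarray(members, dtype=int)
        best[idx] = np.maximum(best[idx], scores[idx].mean())
    grouped = np.isfinite(best)
    out = scores.copy()
    out[grouped] = 0.5 * (scores[grouped] + best[grouped])
    return out

def screen(X, y, groups, keep):
    """Return the indices of the `keep` highest-scoring features."""
    final = group_aware_scores(marginal_scores(X, y), groups)
    return np.argsort(-final)[:keep]

# Simulated example: 1000 features, of which only the first three are active.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1000))
y = 3.0 * X[:, :3].sum(axis=1) + rng.standard_normal(200)

# Two "pathways" that share features 2 and 3.
groups = [[0, 1, 2, 3], [2, 3, 4, 5]]
selected = screen(X, y, groups, keep=20)
print(sorted(selected.tolist()))
```

In this toy setting the dimension drops from 1000 to 20 while the three active features are retained, which is the behaviour the sure screening property formalizes. Feature 3 is inactive, yet its score is lifted by sharing a group with active features; that is a side effect of this simple averaging rule, not a claim about how the paper handles overlap.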


Similar articles

1. A screening method for ultra-high dimensional features with overlapped partition structures.
Stat Methods Med Res. 2023 Jan;32(1):22-40. doi: 10.1177/09622802221129043. Epub 2022 Sep 29.
2. Partition-based ultrahigh-dimensional variable screening.
Biometrika. 2017 Nov;104(4):785-800. doi: 10.1093/biomet/asx052. Epub 2017 Oct 9.
3. Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models.
Entropy (Basel). 2023 May 26;25(6):851. doi: 10.3390/e25060851.
4. The Sparse MLE for Ultra-High-Dimensional Feature Screening.
J Am Stat Assoc. 2014;109(507):1257-1269. doi: 10.1080/01621459.2013.879531.
5. Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
6. Principled sure independence screening for Cox models with ultra-high-dimensional covariates.
J Multivar Anal. 2012 Feb 1;105(1):397-411. doi: 10.1016/j.jmva.2011.08.002.
7. Combined Performance of Screening and Variable Selection Methods in Ultra-High Dimensional Data in Predicting Time-To-Event Outcomes.
Diagn Progn Res. 2018;2. doi: 10.1186/s41512-018-0043-4. Epub 2018 Sep 26.
8. Feature Screening via Distance Correlation Learning.
J Am Stat Assoc. 2012 Jul 1;107(499):1129-1139. doi: 10.1080/01621459.2012.695654.
9. Variable Selection for Sparse High-Dimensional Nonlinear Regression Models by Combining Nonnegative Garrote and Sure Independence Screening.
Stat Sin. 2014 Jul;24(3):1365-1387. doi: 10.5705/ss.2012.316.
10. Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models.
J Am Stat Assoc. 2011 Jun;106(494):544-557. doi: 10.1198/jasa.2011.tm09779.