Unified model-free interaction screening via CV-entropy filter.

Authors

Xiong Wei, Chen Yaxian, Ma Shuangge

Affiliations

School of Statistics, University of International Business and Economics, Beijing 100872, PR China.

Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong.

Publication information

Comput Stat Data Anal. 2023 Apr;180. doi: 10.1016/j.csda.2022.107684. Epub 2022 Dec 28.

Abstract

For many practical high-dimensional problems, interactions have been increasingly found to play important roles beyond main effects. A representative example is gene-gene interaction. Joint analysis, which analyzes all interactions and main effects in a single model, can be seriously challenged by high dimensionality. For high-dimensional data analysis in general, marginal screening has been established as effective for reducing computational cost, increasing stability, and improving estimation/selection performance. Most of the existing marginal screening methods are designed for the analysis of main effects only. The existing screening methods for interaction analysis are often limited by making stringent model assumptions, lacking robustness, and/or requiring predictors to be continuous (and hence lacking flexibility). A unified marginal screening approach tailored to interaction analysis is developed, which can be applied to regression, classification, and survival analysis. Predictors are allowed to be continuous and discrete. The proposed approach is built on Coefficient of Variation (CV) filters based on information entropy. Statistical properties are rigorously established. It is shown that the CV filters are almost insensitive to the distribution tails of predictors, correlation structure among predictors, and sparsity level of signals. An efficient two-stage algorithm is developed to make the proposed approach scalable to ultrahigh-dimensional data. Simulations and the analysis of TCGA LUAD data further establish the practical superiority of the proposed approach.
