Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models.

Authors

Jiang Jinzhu, Shang Junfeng

Affiliation

Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA.

Publication information

Entropy (Basel). 2023 May 26;25(6):851. doi: 10.3390/e25060851.

Abstract

Two-stage feature screening for linear models applies dimension reduction in the first stage to screen out nuisance features and reduce the dimension dramatically to a moderate size; in the second stage, penalized methods such as LASSO and SCAD can be applied for feature selection. Most subsequent work on sure independence screening has focused on the linear model. This motivates us to extend independence screening to generalized linear models, in particular to models with a binary response, by using the point-biserial correlation. We develop a two-stage feature screening method, point-biserial sure independence screening (PB-SIS), for high-dimensional generalized linear models, aiming for high selection accuracy and low computational cost. We demonstrate that PB-SIS is a highly efficient feature screening method and that it possesses the sure independence property under certain regularity conditions. A set of simulation studies confirms the sure independence property, accuracy, and efficiency of PB-SIS. Finally, we apply PB-SIS to a real data example to show its effectiveness.
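
The screening stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `point_biserial_screen`, the argument `d` (number of features to retain), and the use of NumPy are assumptions made here for illustration. The sketch computes the standard point-biserial correlation between each feature and a 0/1 response, ranks features by its absolute value, and keeps the top `d`.

```python
import numpy as np

def point_biserial_screen(X, y, d):
    """Rank features by |point-biserial correlation| with a binary y; keep the top d.

    Hypothetical helper for illustration only. X is (n, p), y holds 0/1 labels.
    Assumes both classes are present and no feature is constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(bool)
    n = y.size
    n1 = int(y.sum())      # number of observations with y = 1
    n0 = n - n1            # number of observations with y = 0

    mean1 = X[y].mean(axis=0)    # per-feature mean in the y = 1 group
    mean0 = X[~y].mean(axis=0)   # per-feature mean in the y = 0 group
    sd = X.std(axis=0)           # population standard deviation (ddof=0)

    # Standard point-biserial formula; equals the Pearson correlation
    # between each column of X and the 0/1 response.
    r = (mean1 - mean0) / sd * np.sqrt(n1 * n0) / n

    # Indices of the d features with the largest |r|, strongest first.
    keep = np.argsort(-np.abs(r))[:d]
    return keep, r
```

In a two-stage workflow, the retained columns `X[:, keep]` would then be passed to a penalized method such as LASSO or SCAD. The abstract does not state how the size of the retained set is chosen, so `d` is left to the caller here.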
