Designing penalty functions in high dimensional problems: The role of tuning parameters.

Authors

Chen Ting-Huei, Sun Wei, Fine Jason P

Affiliations

Department of Mathematics and Statistics, Laval University, Quebec, QC G1V0A6, Canada.

Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.

Publication information

Electron J Stat. 2016;10(2):2312-2328. doi: 10.1214/16-EJS1169. Epub 2016 Aug 29.

DOI:10.1214/16-EJS1169
PMID:28989558
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5628772/
Abstract

Various forms of penalty functions have been developed for regularized estimation and variable selection. Screening approaches are often used to reduce the number of covariates before penalized estimation. However, in certain problems, the number of covariates remains large after screening. For example, in genome-wide association (GWA) studies, the purpose is to identify single nucleotide polymorphisms (SNPs) that are associated with certain traits, and typically there are millions of SNPs and thousands of samples. Because of the strong correlation among nearby SNPs, screening can only reduce the number of SNPs from millions to tens of thousands, and the variable selection problem remains very challenging. Several penalty functions have been proposed for such high dimensional data. However, it is unclear which class of penalty functions is the appropriate choice for a particular application. In this paper, we conduct a theoretical analysis to relate the ranges of tuning parameters of various penalty functions to the dimensionality of the problem and the minimum effect size. We exemplify our theoretical results with several penalty functions. The results suggest that a class of penalty functions that bridges the $L_0$ and $L_1$ penalties requires less restrictive conditions on dimensionality and minimum effect size in order to attain the two fundamental goals of penalized estimation: to shrink all noise coefficients to zero and to obtain unbiased estimates of the true signals. Penalties such as SICA and Log belong to this class, but they have not been used often in applications. The simulation and real data analysis using GWAS data suggest the promising applicability of this class of penalties.
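To make the phrase "bridges the $L_0$ and $L_1$ penalties" concrete, the following minimal sketch evaluates two penalty shapes at a fixed coefficient size while varying the tuning parameter. The SICA form $(a+1)t/(a+t)$ is the standard one from the literature. The normalized log form used here is an assumption chosen for illustration, and the paper's own parameterization of the Log penalty may differ. The function names `sica` and `log_penalty` are hypothetical, and the overall scale $\lambda$ is omitted.

```python
import math


def sica(t, a):
    """SICA penalty shape rho_a(t) = (a + 1)|t| / (a + |t|), with a > 0.

    As a -> 0+ it approaches the L0 indicator 1{t != 0};
    as a -> infinity it approaches the L1 penalty |t|.
    """
    t = abs(t)
    return (a + 1.0) * t / (a + t)


def log_penalty(t, tau):
    """Illustrative normalized log penalty (assumed form, tau > 0):
    log(1 + |t|/tau) / log(1 + 1/tau).

    It behaves like L0 for small tau and like L1 for large tau.
    """
    t = abs(t)
    return math.log1p(t / tau) / math.log1p(1.0 / tau)


if __name__ == "__main__":
    t = 0.5  # a fixed coefficient magnitude
    for param in (1e-6, 1e-2, 1.0, 1e2, 1e6):
        # Small parameter: values near 1 (L0-like). Large: values near |t| (L1-like).
        print(f"param={param:g}  sica={sica(t, param):.4f}  log={log_penalty(t, param):.4f}")
```

Both shapes equal 0 at $t = 0$ and 1 at $|t| = 1$ for every tuning parameter value, so the parameter only controls how concave the penalty is between those points. This is the sense in which a single tuning parameter moves the penalty between $L_0$-like and $L_1$-like behavior.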

