NETWORK-REGULARIZED HIGH-DIMENSIONAL COX REGRESSION FOR ANALYSIS OF GENOMIC DATA.

Authors

Sun Hokeun, Lin Wei, Feng Rui, Li Hongzhe

Affiliations

Pusan National University.

University of Pennsylvania.

Publication information

Stat Sin. 2014 Jul;24(3):1433-1459. doi: 10.5705/ss.2012.317.

Abstract

We consider estimation and variable selection in high-dimensional Cox regression when prior knowledge of the relationships among the covariates, described by a network or graph, is available. A limitation of the existing methodology for survival analysis with high-dimensional genomic data is that a wealth of structural information about many biological processes, such as regulatory networks and pathways, has often been ignored. To incorporate such prior network information into the analysis of genomic data, we propose a network-based regularization method for high-dimensional Cox regression; it uses an ℓ1-penalty to induce sparsity of the regression coefficients and a quadratic Laplacian penalty to encourage smoothness between the coefficients of neighboring variables on a given network. The proposed method is implemented by an efficient coordinate descent algorithm. In the setting where the dimensionality p can grow exponentially fast with the sample size n, we establish model selection consistency and estimation bounds for the proposed estimators. The theoretical results provide insights into the gain from taking into account the network structural information. Extensive simulation studies indicate that our method outperforms Lasso and elastic net in terms of variable selection accuracy and stability. We apply our method to a breast cancer gene expression study and identify several biologically plausible subnetworks and pathways that are associated with breast cancer distant metastasis.
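As a rough illustration of the kind of objective the abstract describes, the sketch below combines a Cox negative log partial likelihood with an ℓ1 term and a quadratic Laplacian term. It is not the authors' implementation: the function names, the two-parameter form `lam1`/`lam2`, the use of the symmetric normalized Laplacian, and the assumption of no tied event times are all choices made here for illustration, and the coordinate descent solver itself is not shown.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2} of an undirected graph.

    Isolated nodes (degree 0) get an all-zero row and column.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    return np.diag(connected.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def neg_log_partial_likelihood(beta, X, time, event):
    """Average negative log partial likelihood (assumes no tied event times)."""
    eta = X @ beta
    order = np.argsort(-time)  # descending time: running sums cover each risk set
    eta_s, event_s = eta[order], event[order]
    log_risk = np.logaddexp.accumulate(eta_s)  # log of sum of exp(eta) over the risk set
    return -np.sum((eta_s - log_risk) * event_s) / len(time)


def network_penalty(beta, lap, lam1, lam2):
    """Sparsity term plus smoothness term over neighboring coefficients."""
    return lam1 * np.abs(beta).sum() + lam2 * (beta @ lap @ beta)


def penalized_objective(beta, X, time, event, lap, lam1, lam2):
    """Quantity a solver would minimize over beta (illustrative parameterization)."""
    return neg_log_partial_likelihood(beta, X, time, event) + network_penalty(beta, lap, lam1, lam2)
```

With this form, two linked covariates that share the same coefficient incur no Laplacian penalty, while coefficients that differ across an edge are penalized quadratically, which is the smoothing behavior the abstract refers to.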
