Robust network-based regularization and variable selection for high-dimensional genomic data in cancer prognosis.

Author information

Ren Jie, Du Yinhao, Li Shaoyu, Ma Shuangge, Jiang Yu, Wu Cen

Affiliations

Department of Statistics, Kansas State University, Manhattan, Kansas.

Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, North Carolina.

Publication information

Genet Epidemiol. 2019 Apr;43(3):276-291. doi: 10.1002/gepi.22194. Epub 2019 Feb 11.

DOI:10.1002/gepi.22194

PMID:30746793

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6446588/

Abstract

In cancer genomic studies, an important objective is to identify prognostic markers associated with patients' survival. Network-based regularization has achieved success in variable selections for high-dimensional cancer genomic data, because of its ability to incorporate the correlations among genomic features. However, as survival time data usually follow skewed distributions, and are contaminated by outliers, network-constrained regularization that does not take the robustness into account leads to false identifications of network structure and biased estimation of patients' survival. In this study, we develop a novel robust network-based variable selection method under the accelerated failure time model. Extensive simulation studies show the advantage of the proposed method over the alternative methods. Two case studies of lung cancer datasets with high-dimensional gene expression measurements demonstrate that the proposed approach has identified markers with important implications.
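
The abstract does not state the objective function, so the following is only a rough illustration of the general idea and should not be read as the authors' formulation. It assumes a weighted absolute-error loss on log survival times (a robust alternative to squared error under an accelerated failure time model), an L1 sparsity penalty, and a graph-Laplacian penalty that encourages connected features to have similar coefficients. The function names, the per-observation `weights` (for example, weights that account for censoring), the choice of penalties, and the tuning parameters `lam1` and `lam2` are all hypothetical.

```python
import numpy as np


def laplacian_from_adjacency(adj):
    """Unnormalized graph Laplacian L = D - A for a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj


def robust_network_objective(beta, X, log_time, weights, laplacian, lam1, lam2):
    """Illustrative objective (an assumption, not the paper's exact criterion).

    weighted absolute-error loss
      + lam1 * L1 norm of beta            (sparsity)
      + 0.5 * lam2 * beta' L beta         (smoothness over the feature network)
    """
    beta = np.asarray(beta, dtype=float)
    residual = np.asarray(log_time, dtype=float) - np.asarray(X, dtype=float) @ beta
    loss = np.sum(np.asarray(weights, dtype=float) * np.abs(residual))
    sparsity = lam1 * np.sum(np.abs(beta))
    smoothness = 0.5 * lam2 * float(beta @ laplacian @ beta)
    return loss + sparsity + smoothness


# Toy example: two features joined by a single edge.
L = laplacian_from_adjacency([[0, 1], [1, 0]])
value = robust_network_objective(
    beta=[1.0, -1.0],
    X=np.eye(2),
    log_time=[1.0, 0.0],
    weights=[1.0, 1.0],
    laplacian=L,
    lam1=0.5,
    lam2=0.25,
)
```

Because the loss grows linearly rather than quadratically in the residual, a single extreme survival time contributes less to the criterion than it would under least squares, which is the intuition behind the robustness the abstract refers to.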

Similar articles

Incorporating network structure in integrative analysis of cancer prognosis data.

Genet Epidemiol. 2013 Feb;37(2):173-83. doi: 10.1002/gepi.21697. Epub 2012 Nov 17.

NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data.

Bioinformatics. 2015 Oct 15;31(20):3330-8. doi: 10.1093/bioinformatics/btv374. Epub 2015 Jun 18.

Multi-omics facilitated variable selection in Cox-regression model for cancer prognosis prediction.

Methods. 2017 Jul 15;124:100-107. doi: 10.1016/j.ymeth.2017.06.010. Epub 2017 Jun 13.

Network-based regularization for high dimensional SNP data in the case-control study of Type 2 diabetes.

BMC Genet. 2017 May 16;18(1):44. doi: 10.1186/s12863-017-0495-5.

Network-based drug sensitivity prediction.

BMC Med Genomics. 2020 Dec 28;13(Suppl 11):193. doi: 10.1186/s12920-020-00829-3.

Bayesian variable selection with graphical structure learning: Applications in integrative genomics.

PLoS One. 2018 Jul 30;13(7):e0195070. doi: 10.1371/journal.pone.0195070. eCollection 2018.

Predicting censored survival data based on the interactions between meta-dimensional omics data in breast cancer.

J Biomed Inform. 2015 Aug;56:220-8. doi: 10.1016/j.jbi.2015.05.019. Epub 2015 Jun 3.

Integrative analysis of genetical genomics data incorporating network structures.

Biometrics. 2019 Dec;75(4):1063-1075. doi: 10.1111/biom.13072. Epub 2019 Apr 29.

Integrative Molecular Analyses of an Individual Transcription Factor-Based Genomic Model for Lung Cancer Prognosis.

Dis Markers. 2021 Dec 7;2021:5125643. doi: 10.1155/2021/5125643. eCollection 2021.

Cited by

A Comprehensive Review of Deep Learning Applications with Multi-Omics Data in Cancer Research.

Genes (Basel). 2025 May 28;16(6):648. doi: 10.3390/genes16060648.

Efficient blockLASSO for polygenic scores with applications to all of us and UK Biobank.

BMC Genomics. 2025 Mar 27;26(1):302. doi: 10.1186/s12864-025-11505-0.

MMOSurv: meta-learning for few-shot survival analysis with multi-omics data.

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae684.

The spike-and-slab quantile LASSO for robust variable selection in cancer genomics studies.

Stat Med. 2024 Nov 20;43(26):4928-4983. doi: 10.1002/sim.10196. Epub 2024 Sep 11.

Bayesian functional analysis for untargeted metabolomics data with matching uncertainty and small sample sizes.

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae141.

Integrating DNA methylation and gene expression data in a single gene network using the iNETgrate package.

Sci Rep. 2023 Dec 8;13(1):21721. doi: 10.1038/s41598-023-48237-8.

Identification and validation of a DNA methylation-driven gene-based prognostic model for clear cell renal cell carcinoma.

BMC Genomics. 2023 Jun 7;24(1):307. doi: 10.1186/s12864-023-09416-z.

Identification of an individualized therapy prognostic signature for head and neck squamous cell carcinoma.

BMC Genomics. 2023 Apr 28;24(1):221. doi: 10.1186/s12864-023-09325-1.

Springer: An R package for bi-level variable selection of high-dimensional longitudinal data.

Front Genet. 2023 Apr 6;14:1088223. doi: 10.3389/fgene.2023.1088223. eCollection 2023.

Gene Screening in High-Throughput Right-Censored Lung Cancer Data.

Onco (Basel). 2022 Dec;2(4):305-318. doi: 10.3390/onco2040017. Epub 2022 Oct 17.
