Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection.

Author information

Zeng Yaohui, Breheny Patrick

Affiliation

Department of Biostatistics, University of Iowa, Iowa City, IA, USA.

Publication information

Cancer Inform. 2016 Sep 15;15:179-87. doi: 10.4137/CIN.S40043. eCollection 2016.

Abstract

Discovering important genes that account for the phenotype of interest has long been a challenge in genome-wide expression analysis. Analyses such as gene set enrichment analysis (GSEA) that incorporate pathway information have become widespread in hypothesis testing, but pathway-based approaches have been largely absent from regression methods due to the challenges of dealing with overlapping pathways and the resulting lack of available software. The R package grpreg is widely used to fit group lasso and other group-penalized regression models; in this study, we develop an extension, grpregOverlap, to allow for overlapping group structure using a latent variable approach. We compare this approach to the ordinary lasso and to GSEA using both simulated and real data. We find that incorporation of prior pathway information can substantially improve the accuracy of gene expression classifiers, and we shed light on several ways in which hypothesis-testing approaches such as GSEA differ from regression approaches with respect to the analysis of pathway data.
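The latent variable approach mentioned above is commonly realized by giving each pathway its own copy of every gene it contains, so that the expanded groups no longer overlap and an ordinary group-penalized solver can be applied. The following is a minimal sketch of that expansion step only, under that assumption; the function names `expand_overlapping_groups` and `collapse_coefficients` and the toy data are hypothetical illustrations, not part of the grpregOverlap API, and no model fitting is performed.

```python
from typing import Dict, List, Tuple


def expand_overlapping_groups(
    X: List[List[float]],
    groups: Dict[str, List[int]],
) -> Tuple[List[List[float]], List[str], List[int]]:
    """Duplicate columns so each group owns a private copy of its members.

    Returns the expanded design matrix, the group label of each expanded
    column, and the original column index each expanded column came from.
    (Hypothetical helper for illustration only.)
    """
    group_of_column: List[str] = []
    source_column: List[int] = []
    for name, members in groups.items():
        for j in members:
            group_of_column.append(name)
            source_column.append(j)
    # Each row is rebuilt by picking the source columns in expanded order.
    X_expanded = [[row[j] for j in source_column] for row in X]
    return X_expanded, group_of_column, source_column


def collapse_coefficients(
    latent_coefs: List[float],
    source_column: List[int],
    n_features: int,
) -> List[float]:
    """Sum latent coefficients back onto the original features."""
    beta = [0.0] * n_features
    for coef, j in zip(latent_coefs, source_column):
        beta[j] += coef
    return beta


# Toy data: 2 samples, 3 genes; gene 1 belongs to both pathways.
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
pathways = {"pathway_A": [0, 1], "pathway_B": [1, 2]}

X_expanded, group_of_column, source_column = expand_overlapping_groups(X, pathways)

# Made-up latent coefficients, standing in for the output of a group-lasso fit.
latent = [0.5, 0.25, 0.5, 0.0]
beta = collapse_coefficients(latent, source_column, n_features=3)
```

In this sketch the shared gene's coefficient is the sum of its two latent copies, which is how a gene can be selected through one pathway while its copy in another pathway stays at zero.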

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bba2/5026200/d0b175d94b03/cin-15-2016-179f1.jpg

Similar articles

1
Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection.
Cancer Inform. 2016 Sep 15;15:179-87. doi: 10.4137/CIN.S40043. eCollection 2016.
2
Genome-wide association analysis by lasso penalized logistic regression.
Bioinformatics. 2009 Mar 15;25(6):714-21. doi: 10.1093/bioinformatics/btp041. Epub 2009 Jan 28.
3
FUNNEL-GSEA: FUNctioNal ELastic-net regression in time-course gene set enrichment analysis.
Bioinformatics. 2017 Jul 1;33(13):1944-1952. doi: 10.1093/bioinformatics/btx104.
4
Investigating unique genes of five molecular subtypes of breast cancer using penalized logistic regression.
J Cancer Res Ther. 2023 Apr;19(Supplement):S126-S137. doi: 10.4103/jcrt.jcrt_811_21.
7
Variable Selection with Prior Information for Generalized Linear Models via the Prior LASSO Method.
J Am Stat Assoc. 2016;111(513):355-376. doi: 10.1080/01621459.2015.1008363. Epub 2016 May 5.
8
Identification of clinically relevant features in hypertensive patients using penalized regression: a case study of cardiovascular events.
Med Biol Eng Comput. 2019 Sep;57(9):2011-2026. doi: 10.1007/s11517-019-02007-9. Epub 2019 Jul 25.
9
Optimism Bias Correction in Omics Studies with Big Data: Assessment of Penalized Methods on Simulated Data.
OMICS. 2019 Apr;23(4):207-213. doi: 10.1089/omi.2018.0191. Epub 2019 Feb 22.
10
Accounting for grouped predictor variables or pathways in high-dimensional penalized Cox regression models.
BMC Bioinformatics. 2020 Jul 2;21(1):277. doi: 10.1186/s12859-020-03618-y.

Cited by

1
Weighted overlapping group lasso for integrating prior network knowledge into gene set analysis.
BMC Bioinformatics. 2025 Sep 1;26(1):226. doi: 10.1186/s12859-025-06170-9.
9
Multi-stage adaptive enrichment trial design with subgroup estimation.
J Biopharm Stat. 2020 Nov 1;30(6):1038-1049. doi: 10.1080/10543406.2020.1832109. Epub 2020 Oct 18.
10
Adaptive group-regularized logistic elastic net regression.
Biostatistics. 2021 Oct 13;22(4):723-737. doi: 10.1093/biostatistics/kxz062.
