Combining least absolute shrinkage and selection operator (LASSO) and principal-components analysis for detection of gene-gene interactions in genome-wide association studies.

Authors

D'Angelo Gina M, Rao DC, Gu C Charles

Affiliation

Division of Biostatistics, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, Missouri 63110, USA.

Publication information

BMC Proc. 2009 Dec 15;3 Suppl 7(Suppl 7):S62. doi: 10.1186/1753-6561-3-s7-s62.

Abstract

Variable selection in genome-wide association studies is statistically challenging because there are more variables than subjects. We propose an approach that uses principal-components analysis (PCA) and the least absolute shrinkage and selection operator (LASSO) to identify gene-gene interactions in genome-wide association studies. PCA was first used to reduce the dimension of the single-nucleotide polymorphisms (SNPs) within each gene. The interactions of the gene PCA scores were then entered into the LASSO to determine whether any gene-gene signals exist. We extended the PCA-LASSO approach with the bootstrap to estimate the standard errors and confidence intervals of the LASSO coefficient estimates. This method was compared with entering the raw SNP values into the LASSO and with a logistic model containing individual gene-gene interactions. We demonstrated these methods with the Genetic Analysis Workshop 16 rheumatoid arthritis genome-wide association study data, and our results identified a few gene-gene signals. Based on our results, the PCA-LASSO method shows promise in identifying gene-gene interactions, and at this time we suggest using it alongside other conventional approaches, such as generalized linear models, to narrow down genetic signals.
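The pipeline described in the abstract (per-gene PCA, interaction terms built from the gene scores, an L1-penalized fit, and bootstrap standard errors) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. The abstract does not say how many components were kept per gene, how the penalty was chosen, or how the bootstrap intervals were formed, so the choices here are assumptions: one component per gene, a fixed penalty `lam`, a simple proximal-gradient solver, percentile intervals, and simulated genotypes. All function and variable names are hypothetical.

```python
import numpy as np

def first_pc_scores(snps):
    """Score of each subject on the first principal component of one gene's SNPs."""
    centered = snps - snps.mean(axis=0)
    # Rows of vt are the principal axes of the centered subjects-by-SNPs matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]

def interaction_features(gene_snps):
    """Standardized pairwise products of gene-level PC scores (assumed encoding)."""
    names = sorted(gene_snps)
    scores = {g: first_pc_scores(gene_snps[g]) for g in names}
    pairs, cols = [], []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairs.append((a, b))
            cols.append(scores[a] * scores[b])
    X = np.column_stack(cols)
    return pairs, (X - X.mean(axis=0)) / X.std(axis=0)

def lasso_logistic(X, y, lam, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent (ISTA)."""
    n, p = X.shape
    beta, intercept = np.zeros(p), 0.0
    # Step size from an upper bound on the Lipschitz constant of the gradient.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(intercept + X @ beta)))
        resid = prob - y
        intercept -= step * resid.mean()          # the intercept is not penalized
        b = beta - step * (X.T @ resid) / n
        beta = np.sign(b) * np.maximum(np.abs(b) - step * lam, 0.0)  # soft threshold
    return beta

def bootstrap_lasso(X, y, lam, n_boot=50, seed=0):
    """Bootstrap standard errors and 95% percentile intervals for the coefficients."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample subjects with replacement
        draws[b] = lasso_logistic(X[idx], y[idx], lam)
    lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)
    return draws.std(axis=0, ddof=1), lower, upper

# Simulated data (an assumption for illustration): 3 genes, 5 correlated SNPs each.
rng = np.random.default_rng(1)
n = 200
latent = {g: rng.normal(size=(n, 1)) for g in ("geneA", "geneB", "geneC")}
genes = {}
for g, z in latent.items():
    noisy = z + 0.5 * rng.normal(size=(n, 5))
    genes[g] = (noisy > -0.5).astype(float) + (noisy > 0.5)   # genotypes coded 0/1/2
# Case/control status driven by a geneA x geneB interaction on the latent scale.
logit = (latent["geneA"] * latent["geneB"]).ravel()
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

pairs, X = interaction_features(genes)
beta = lasso_logistic(X, y, lam=0.02)
se, lower, upper = bootstrap_lasso(X, y, lam=0.02)
for pair, b, s, lo, hi in zip(pairs, beta, se, lower, upper):
    print(pair, round(b, 3), round(s, 3), (round(lo, 3), round(hi, 3)))
```

In this sketch the PCA step is done once and only the LASSO fit is repeated inside the bootstrap; recomputing the gene scores in each resample is an equally plausible reading of the abstract. Because the sign of a principal component is arbitrary, the sign of an interaction coefficient carries no meaning by itself here.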

Similar articles

2. Fast and efficient correction for population stratification in multi-locus genome-wide association studies. Genetica. 2021 Dec;149(5-6):313-325. doi: 10.1007/s10709-021-00129-3. Epub 2021 Sep 4.
6. Evaluation of the lasso and the elastic net in genome-wide association studies. Front Genet. 2013 Dec 4;4:270. doi: 10.3389/fgene.2013.00270. eCollection 2013.
7. Regularized regression method for genome-wide association studies. BMC Proc. 2011 Nov 29;5 Suppl 9(Suppl 9):S67. doi: 10.1186/1753-6561-5-S9-S67.
8. A novel genomic selection method combining GBLUP and LASSO. Genetica. 2015 Jun;143(3):299-304. doi: 10.1007/s10709-015-9826-5. Epub 2015 Feb 6.
9. An Efficient Genome-Wide Multilocus Epistasis Search. Genetics. 2015 Nov;201(3):865-70. doi: 10.1534/genetics.115.182444. Epub 2015 Sep 23.

Cited by

1. Multi-omics decodes host-specific and environmental microbiome interactions in sepsis. Front Microbiol. 2025 Jun 26;16:1618177. doi: 10.3389/fmicb.2025.1618177. eCollection 2025.
4. Modeling of new markers for the diagnosis and prognosis of pancreatic cancer based on the transition from inflammation to cancer. Transl Cancer Res. 2024 Mar 31;13(3):1425-1442. doi: 10.21037/tcr-23-1365. Epub 2024 Mar 27.
5. Lipid metabolism-related gene expression in the immune microenvironment predicts prognostic outcomes in renal cell carcinoma. Front Immunol. 2023 Nov 27;14:1324205. doi: 10.3389/fimmu.2023.1324205. eCollection 2023.
6. Inferring circadian gene regulatory relationships from gene expression data with a hybrid framework. BMC Bioinformatics. 2023 Sep 26;24(1):362. doi: 10.1186/s12859-023-05458-y.
7. A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. Front Bioinform. 2022 Jun 27;2:927312. doi: 10.3389/fbinf.2022.927312. eCollection 2022.
8. Ensemble learning for detecting gene-gene interactions in colorectal cancer. PeerJ. 2018 Oct 29;6:e5854. doi: 10.7717/peerj.5854. eCollection 2018.
9. Eigen-Epistasis for detecting gene-gene interactions. BMC Bioinformatics. 2017 Jan 23;18(1):54. doi: 10.1186/s12859-017-1488-0.
10. Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data. Front Neurosci. 2016 Jul 28;10:344. doi: 10.3389/fnins.2016.00344. eCollection 2016.
