Multilocus association testing with penalized regression.

Affiliation

Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA.

Publication information

Genet Epidemiol. 2011 Dec;35(8):755-65. doi: 10.1002/gepi.20625. Epub 2011 Sep 15.

DOI:10.1002/gepi.20625

PMID:21922539

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC3350336/

Abstract

In multilocus association analysis, since some markers may not be associated with a trait, it seems attractive to use penalized regression with the capability of automatic variable selection. On the other hand, in spite of a rapidly growing body of literature on penalized regression, most focus on variable selection and outcome prediction, for which penalized methods are generally more effective than their nonpenalized counterparts. However, for statistical inference, i.e. hypothesis testing and interval estimation, it is less clear how penalized methods would perform, or even how to best apply them, largely due to lack of studies on this topic. In our motivating data for a cohort of kidney transplant recipients, it is of primary interest to assess whether a group of genetic variants are associated with a binary clinical outcome, acute rejection at 6 months. In this article, we study some technical issues and alternative implementations of hypothesis testing in Lasso penalized logistic regression, and compare their performance with each other and with several existing global tests, some of which are specifically designed as variance component tests for high-dimensional data. The most interesting, and perhaps surprising, conclusion of this study is that, for low to moderately high-dimensional data, statistical tests based on Lasso penalized regression are not necessarily more powerful than some existing global tests. In addition, in penalized regression, rather than building a test based on a single selected "best" model, combining multiple tests, each of which is built on a candidate model, might be more promising.
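The idea of combining tests built on several candidate models, rather than on one selected "best" model, can be illustrated with a minimal sketch. This is not the authors' implementation: the proximal-gradient solver, the log-likelihood-gain statistic, the penalty grid `lams`, and the min-p permutation scheme are all assumptions chosen for brevity, and every function name here is hypothetical.

```python
import numpy as np

def fit_lasso_logistic(X, y, lam, n_iter=200):
    """Lasso-penalized logistic regression via proximal gradient (ISTA).
    The intercept is left unpenalized. Assumes 0 < mean(y) < 1."""
    n, p = X.shape
    beta = np.zeros(p)
    b0 = np.log(y.mean() / (1.0 - y.mean()))  # start at the null-model intercept
    # Step size from a Lipschitz bound on the gradient of the mean logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        resid = 1.0 / (1.0 + np.exp(-(b0 + X @ beta))) - y
        z = beta - step * (X.T @ resid) / n
        b0 -= step * resid.mean()
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return b0, beta

def loglik(y, eta):
    """Bernoulli log-likelihood for linear predictor eta."""
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def lasso_stats(X, y, lams):
    """One statistic per candidate model (one per penalty value):
    log-likelihood gain of the Lasso fit over the intercept-only fit."""
    p0 = y.mean()
    ll0 = loglik(y, np.full(len(y), np.log(p0 / (1.0 - p0))))
    gains = []
    for lam in lams:
        b0, beta = fit_lasso_logistic(X, y, lam)
        gains.append(loglik(y, b0 + X @ beta) - ll0)
    return np.array(gains)

def minp_permutation_test(X, y, lams, n_perm=199, seed=0):
    """Global test: combine the per-model tests by their minimum p-value and
    calibrate that minimum with the same set of outcome permutations."""
    rng = np.random.default_rng(seed)
    X = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no monomorphic marker
    rows = [lasso_stats(X, y, lams)]          # row 0: observed data
    rows += [lasso_stats(X, rng.permutation(y), lams) for _ in range(n_perm)]
    stats = np.vstack(rows)
    # Permutation p-value of every row for every penalty value, by ranking.
    pvals = (stats[None, :, :] >= stats[:, None, :]).mean(axis=1)
    min_p = pvals.min(axis=1)
    return float((min_p <= min_p[0]).mean())
```

Calling `minp_permutation_test(G, y, lams=[0.01, 0.03, 0.06])` on an n-by-p genotype matrix `G` and a 0/1 outcome vector `y` returns a single global p-value for the marker set. Permuting the outcome keeps the correlation among markers intact, which is why the sketch relies on it instead of an asymptotic reference distribution.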

Similar articles

Penalized regression approaches to testing for quantitative trait-rare variant association.

Front Genet. 2014 May 13;5:121. doi: 10.3389/fgene.2014.00121. eCollection 2014.

SNP selection in genome-wide and candidate gene studies via penalized logistic regression.

Genet Epidemiol. 2010 Dec;34(8):879-91. doi: 10.1002/gepi.20543.

A screening-testing approach for detecting gene-environment interactions using sequential penalized and unpenalized multiple logistic regression.

Pac Symp Biocomput. 2015:183-94.

Efficient penalized generalized linear mixed models for variable selection and genetic risk prediction in high-dimensional data.

Bioinformatics. 2023 Feb 3;39(2). doi: 10.1093/bioinformatics/btad063.

Evaluation of Penalized and Nonpenalized Methods for Disease Prediction with Large-Scale Genetic Data.

Biomed Res Int. 2015;2015:605891. doi: 10.1155/2015/605891. Epub 2015 Aug 4.

L1-Penalized Multinomial Regression: Estimation, Inference, and Prediction, With an Application to Risk Factor Identification for Different Dementia Subtypes.

Stat Med. 2024 Dec 30;43(30):5711-5747. doi: 10.1002/sim.10263. Epub 2024 Nov 12.

A Novel Statistic for Global Association Testing Based on Penalized Regression.

Genet Epidemiol. 2015 Sep;39(6):415-26. doi: 10.1002/gepi.21915.

Penalized linear mixed models for structured genetic data.

Genet Epidemiol. 2021 Jul;45(5):427-444. doi: 10.1002/gepi.22384. Epub 2021 May 16.

Resampling-based tests for Lasso in genome-wide association studies.

BMC Genet. 2017 Jul 24;18(1):70. doi: 10.1186/s12863-017-0533-3.

Cited by

Prioritizing genetic variants in GWAS with lasso using permutation-assisted tuning.

Bioinformatics. 2020 Jun 1;36(12):3811-3817. doi: 10.1093/bioinformatics/btaa229.

Polygenic risk score in postmortem diagnosed sporadic early-onset Alzheimer's disease.

Neurobiol Aging. 2018 Feb;62:244.e1-244.e8. doi: 10.1016/j.neurobiolaging.2017.09.035. Epub 2017 Oct 10.

Penalized regression approaches to testing for quantitative trait-rare variant association.

Front Genet. 2014 May 13;5:121. doi: 10.3389/fgene.2014.00121. eCollection 2014.

Regularized rare variant enrichment analysis for case-control exome sequencing data.

Genet Epidemiol. 2014 Feb;38(2):104-13. doi: 10.1002/gepi.21783. Epub 2013 Dec 30.

On multi-marker tests for association in case-control studies.

Front Genet. 2013 Dec 16;4:252. doi: 10.3389/fgene.2013.00252. eCollection 2013.

Statistical tests for detecting associations with groups of genetic variants: generalization, evaluation, and implementation.

Eur J Hum Genet. 2013 Jun;21(6):680-6. doi: 10.1038/ejhg.2012.220. Epub 2012 Oct 24.

Reprioritizing genetic associations in hit regions using LASSO-based resample model averaging.

Genet Epidemiol. 2012 Jul;36(5):451-62. doi: 10.1002/gepi.21639. Epub 2012 Apr 30.
