Hierarchical selection of genetic and gene by environment interaction effects in high-dimensional mixed models.

Author information

St-Pierre Julien, Oualkacha Karim, Rai Bhatnagar Sahir

Affiliations

Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada.

Département de Mathématiques, Faculté des Sciences, Université du Québec à Montréal, Montreal, QC, Canada.

Publication information

Stat Methods Med Res. 2025 Jan;34(1):180-198. doi: 10.1177/09622802241293768. Epub 2024 Dec 10.

DOI:10.1177/09622802241293768

PMID:39659138

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11800719/

Abstract

Interactions between genes and environmental factors may play a key role in the etiology of many common disorders. Several regularized generalized linear models have been proposed for hierarchical selection of gene by environment interaction effects, where a gene-environment interaction effect is selected only if the corresponding genetic main effect is also selected in the model. However, none of these methods can include random effects to account for population structure, subject relatedness and shared environmental exposure. In this article, we develop a unified approach based on regularized penalized quasi-likelihood estimation to perform hierarchical selection of gene-environment interaction effects in sparse regularized mixed models. We compare the selection and prediction accuracy of our proposed model with existing methods through simulations in the presence of population structure and shared environmental exposure. We show that, in all simulation scenarios, including an additional random effect to account for the shared environmental exposure reduces the false positive rate and false discovery rate of our proposed method for selection of both gene-environment interaction and main effects. Using the F1 score as a balanced measure of the false discovery rate and true positive rate, we further show that in the hierarchical simulation scenarios, our method outperforms other methods for retrieving important gene-environment interaction effects. Finally, we apply our method to data from the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study and find that it retrieves previously reported significant loci.
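The selection metrics named in the abstract can be illustrated with a short sketch. It assumes the balanced score is the usual F1 score, that is, the harmonic mean of precision (1 minus the false discovery rate) and the true positive rate. The function names and the toy SNP labels are hypothetical and are not taken from the article or its software.

```python
def selection_metrics(selected, truth):
    """Return (fdr, tpr, f1) for a set of selected effects against the true set.

    Assumption: F1 is the harmonic mean of precision (1 - FDR) and TPR.
    """
    selected, truth = set(selected), set(truth)
    tp = len(selected & truth)   # correctly selected effects
    fp = len(selected - truth)   # falsely selected effects
    fdr = fp / len(selected) if selected else 0.0
    tpr = tp / len(truth) if truth else 0.0
    precision = 1.0 - fdr
    denom = precision + tpr
    f1 = 2 * precision * tpr / denom if denom > 0 else 0.0
    return fdr, tpr, f1


def respects_hierarchy(main_effects, interactions):
    """True if every selected interaction also has its main effect selected."""
    return set(interactions) <= set(main_effects)


# Hypothetical toy example: four truly interacting SNPs, three selected.
true_gxe = {"snp1", "snp2", "snp3", "snp4"}
selected_gxe = {"snp1", "snp2", "snp5"}
selected_main = {"snp1", "snp2", "snp5", "snp7"}

fdr, tpr, f1 = selection_metrics(selected_gxe, true_gxe)
hierarchy_ok = respects_hierarchy(selected_main, selected_gxe)
print(f"FDR={fdr:.3f} TPR={tpr:.3f} F1={f1:.3f} hierarchy={hierarchy_ok}")
```

In this toy case two of the three selected interactions are true, so the FDR is 1/3, the TPR is 1/2, and the F1 score is 4/7. The hierarchy check passes because every selected interaction has a selected main effect.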


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b83/11800719/ef39563769d8/10.1177_09622802241293768-fig1.jpg


