School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China.
Department of Biostatistics, Yale University, New Haven, Connecticut.
Biometrics. 2020 Mar;76(1):23-35. doi: 10.1111/biom.13139. Epub 2019 Oct 9.
Gene-environment (G-E) interactions have important implications for the etiology, progression, and treatment of complex diseases, beyond the main G and E effects. G-E interaction analysis becomes more challenging as dimensionality grows and because the "main effects, interactions" hierarchy must be accommodated. Recent literature has produced an array of novel methods, many of them based on penalization. Most of these studies, however, do not adequately accommodate the structure of the G measurements, for example the adjacency structure of single nucleotide polymorphisms (SNPs), which is attributable to their physical adjacency on the chromosomes, or the network structure of gene expressions, which is attributable to their coordinated biological functions and correlated measurements. In this study, we develop structured G-E interaction analysis, in which such structures are accommodated through penalization of both the main G effects and the interactions. Penalization is also applied for regularized estimation and selection. The proposed structured interaction analysis can be effectively realized computationally and is shown to have consistency properties under high-dimensional settings. Simulations and analyses of GENEVA diabetes data with SNP measurements and TCGA melanoma data with gene expression measurements demonstrate its competitive practical performance.
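To make "accommodating structure through penalization" concrete, the following is a minimal sketch under stated assumptions, not the estimator from this paper. The abstract does not give the form of the penalty, so the sketch substitutes a generic graph-Laplacian smoothness penalty on a chain of physically adjacent SNPs, applied to the main G effects and to each E factor's interaction coefficients. All function names (`interaction_design`, `chain_laplacian`, `structure_penalty`) are hypothetical, and the sparsity penalty used for selection and for the hierarchy is omitted.

```python
import numpy as np

def interaction_design(G, E):
    """Build the n x (q*p) matrix of G-E products.

    G is n x p (e.g. SNPs), E is n x q (environmental factors).
    Block k of the result holds E[:, k] multiplied into every column of G.
    """
    q = E.shape[1]
    return np.hstack([G * E[:, [k]] for k in range(q)])

def chain_laplacian(p):
    """Graph Laplacian of a path graph: SNP j is linked to SNP j+1.

    Assumption: physical adjacency is modeled as a simple chain.
    """
    A = np.zeros((p, p))
    idx = np.arange(p - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def structure_penalty(beta, gamma, L):
    """Quadratic smoothness penalty (assumed form, not the paper's).

    beta  : length-p vector of main G effects.
    gamma : q x p array, row k holds interactions of all G with E_k.
    For a chain Laplacian, b @ L @ b equals the sum of squared
    differences between coefficients of adjacent SNPs.
    """
    main = beta @ L @ beta
    inter = sum(g @ L @ g for g in gamma)
    return main + inter
```

In a full analysis, a term of this kind would be added, with a tuning parameter, to a loss function and a sparsity-inducing penalty; here it only illustrates how adjacent coefficients can be encouraged to be similar for both main effects and interactions.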