Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA.
Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, Indiana, 46202, USA.
Genet Epidemiol. 2022 Jul;46(5-6):317-340. doi: 10.1002/gepi.22461. Epub 2022 Jun 29.
Penalized variable selection for high-dimensional longitudinal data has received much attention because it can account for the correlation among repeated measurements while providing additional and essential information for improved identification and prediction performance. Despite this success, the potential of penalization methods for accommodating structured sparsity in longitudinal studies is far from fully understood. In this article, we develop a sparse group penalization method to conduct the bi-level gene-environment (G×E) interaction study under a repeatedly measured phenotype. Within the quadratic inference function framework, the proposed method achieves simultaneous identification of main and interaction effects at both the group and individual levels. Simulation studies show that the proposed method outperforms major competitors. In the case study of asthma data from the Childhood Asthma Management Program, we conduct a G×E study using high-dimensional single nucleotide polymorphism data as genetic factors and the longitudinal trait, forced expiratory volume in 1 s, as the phenotype. Our method leads to improved prediction and identification of main and interaction effects with important implications.
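To illustrate what bi-level (group and individual) selection means, the sketch below applies the proximal operator of a generic sparse group lasso penalty to a coefficient vector. This is an illustration only and not the estimator of the article: the penalty form, the function names `soft_threshold` and `sparse_group_prox`, the tuning values, and the toy coefficients are all assumptions, and the quadratic inference function objective is not modeled here. Each group is assumed to hold one genetic factor's main effect together with its interactions with environmental factors.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding: shrinks each entry toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_group_prox(z, groups, lam_ind, lam_grp):
    """Proximal operator of an assumed sparse group lasso penalty
    lam_ind * ||b||_1 + lam_grp * sum_g sqrt(p_g) * ||b_g||_2.

    Individual level: soft-threshold each coefficient.
    Group level: shrink the whole group, or zero it out entirely.
    """
    out = np.zeros_like(z, dtype=float)
    for idx in groups:
        u = soft_threshold(z[idx], lam_ind)       # individual-level sparsity
        norm = np.linalg.norm(u)
        thresh = lam_grp * np.sqrt(len(idx))      # group-size-weighted threshold
        if norm > thresh:                         # group survives, shrunken
            out[idx] = (1.0 - thresh / norm) * u
        # otherwise the whole group stays at zero (group-level sparsity)
    return out

# Hypothetical toy input: two groups of three coefficients each
# (main effect followed by two interaction effects).
z = np.array([3.0, 0.2, -2.5, 0.3, -0.1, 0.2])
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]
beta = sparse_group_prox(z, groups, lam_ind=0.5, lam_grp=0.5)
print(beta)
```

In this toy input the second group is removed as a whole, while the first group is kept but loses its weak second coefficient, which is the pattern of selection at both levels that the abstract describes.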