UIC School of Public Health, Division of Epidemiology and Biostatistics, (MC 923), 1603 West Taylor Street, 987 SPHPI, Chicago, IL 60612, USA.
Genomics. 2013 Oct;102(4):189-94. doi: 10.1016/j.ygeno.2013.08.006. Epub 2013 Aug 29.
High-throughput cancer studies have been conducted extensively to search for genetic markers associated with outcomes beyond clinical and environmental risk factors. Gene-environment (G × E) interactions can have important implications beyond main effects. The commonly adopted single-marker analysis cannot accommodate the joint effects of a large number of markers. Existing joint-effects methods also have limitations: they may suffer from high computational cost, fail to respect the "main effect, interaction" hierarchical structure, or use ineffective techniques. We develop a penalization method for identifying important G × E interactions and main effects. It has an intuitive formulation, respects the hierarchical structure, accommodates the joint effects of multiple markers, and is computationally affordable. In the numerical study, we analyze prognosis data under the accelerated failure time (AFT) model. Simulation shows satisfactory performance of the proposed method. Analysis of a non-Hodgkin lymphoma (NHL) study with SNP measurements shows that the proposed method identifies markers with important implications and has satisfactory prediction performance.
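The "main effect, interaction" hierarchy can be illustrated with a small sketch. This is not the authors' penalization method, which the abstract does not specify. It only shows one plausible way to lay out a joint G × E design and to check whether a fitted coefficient set respects the hierarchy. The function names `build_design` and `respects_hierarchy`, the column ordering, and the reading of the hierarchy as "a G × E interaction may be nonzero only if that marker's main effect is nonzero" are all assumptions made for illustration.

```python
from typing import List, Sequence


def build_design(G: Sequence[Sequence[float]],
                 E: Sequence[Sequence[float]]) -> List[List[float]]:
    """Hypothetical joint design: one row per subject, laid out as
    [E factors..., G markers..., G_j * E_k for every pair (j, k)]."""
    rows = []
    for g_row, e_row in zip(G, E):
        # All pairwise products, marker index varying slowest.
        interactions = [g * e for g in g_row for e in e_row]
        rows.append(list(e_row) + list(g_row) + interactions)
    return rows


def respects_hierarchy(beta_main: Sequence[float],
                       beta_inter: Sequence[Sequence[float]],
                       tol: float = 1e-12) -> bool:
    """Assumed hierarchy rule: if any interaction of marker j is selected
    (nonzero), the main effect of marker j must be selected as well.

    beta_main[j]     -- main-effect coefficient of marker j
    beta_inter[j][k] -- coefficient of the G_j x E_k interaction
    """
    for main, inter_row in zip(beta_main, beta_inter):
        main_selected = abs(main) > tol
        any_inter_selected = any(abs(b) > tol for b in inter_row)
        if any_inter_selected and not main_selected:
            return False
    return True


# Two markers, two environmental factors, one subject.
design = build_design(G=[[1.0, 2.0]], E=[[3.0, 4.0]])

# Marker 0 has a main effect and an interaction; marker 1 has neither.
ok = respects_hierarchy([0.5, 0.0], [[0.2, 0.0], [0.0, 0.0]])

# Marker 0 has an interaction but no main effect: hierarchy violated.
violated = respects_hierarchy([0.0], [[0.1]])
```

A check of this kind is only a diagnostic. A method that respects the hierarchy by construction, as the abstract describes, would enforce the constraint through the penalty itself rather than verify it afterwards.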