Ritchie Marylyn D, Motsinger Alison A, Bush William S, Coffey Christopher S, Moore Jason H
Center for Human Genetics Research, Department of Molecular Physiology and Biophysics, Vanderbilt University, 519 Light Hall, Nashville, TN 37232.
Appl Soft Comput. 2007 Jan;7(1):471-479. doi: 10.1016/j.asoc.2006.01.013.
The identification of genes that influence the risk of common, complex disease primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. This challenge is partly due to the limitations of parametric statistical methods for detecting genetic effects that depend solely or partially on interactions. We previously introduced a genetic programming neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of genetic and gene-environment combinations associated with disease risk. Previous empirical studies suggest that GPNN has excellent power for identifying gene-gene and gene-environment interactions. The goal of this study was to compare the power of GPNN with that of stepwise logistic regression (SLR) and classification and regression trees (CART) for identifying gene-gene and gene-environment interactions. SLR and CART are standard methods of analysis for genetic association studies. Using simulated data, we show that GPNN has higher power to identify gene-gene and gene-environment interactions than SLR and CART. These results indicate that GPNN may be a useful pattern recognition approach for detecting gene-gene and gene-environment interactions in studies of human disease.
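The abstract's point that some genetic effects depend solely on interactions can be illustrated with a toy calculation. The sketch below is not one of the study's simulation models, which the abstract does not describe. The penetrance table, the two binary loci `A` and `B`, and the carrier frequency of 0.5 are all hypothetical values chosen so that neither locus has a marginal effect even though disease risk varies strongly across the joint genotypes.

```python
from itertools import product

# Hypothetical penetrance table P(disease | A, B) for two binary loci.
# Risk is high only when exactly one locus carries the variant (XOR-like).
PENETRANCE = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.1}
FREQ = 0.5  # assumed carrier frequency at each locus; loci assumed independent


def prob(value, freq=FREQ):
    """Probability that a locus takes the given carrier status (0 or 1)."""
    return freq if value == 1 else 1.0 - freq


def marginal_risk(locus, value):
    """P(disease | one locus fixed), averaging over the other locus."""
    total = 0.0
    for other in (0, 1):
        genotype = (value, other) if locus == 0 else (other, value)
        total += prob(other) * PENETRANCE[genotype]
    return total


# Marginal risk for every (locus, carrier status) combination.
marginal = {
    (locus, value): marginal_risk(locus, value)
    for locus, value in product((0, 1), (0, 1))
}

# Difference between the highest and lowest risk across joint genotypes.
joint_spread = max(PENETRANCE.values()) - min(PENETRANCE.values())

for (locus, value), risk in sorted(marginal.items()):
    print(f"locus {'AB'[locus]} = {value}: marginal risk {risk:.2f}")
print(f"joint risk range across genotype pairs: {joint_spread:.2f}")
```

Under these assumed values, each locus viewed alone shows the same disease risk for carriers and non-carriers, so a model that only adds terms with detectable main effects has nothing to select. The signal appears only when both loci are considered jointly.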