Ritchie Marylyn D, White Bill C, Parker Joel S, Hahn Lance W, Moore Jason H
Program in Human Genetics and Department of Molecular Physiology and Biophysics, Vanderbilt University Medical School, Nashville, TN, 37232-0700, USA.
BMC Bioinformatics. 2003 Jul 7;4:28. doi: 10.1186/1471-2105-4-28.
Appropriate definition of neural network architecture prior to data analysis is crucial for successful data mining. This can be challenging when the underlying model of the data is unknown. The goal of this study was to determine whether optimizing neural network architecture using genetic programming as a machine learning strategy would improve the ability of neural networks to model and detect nonlinear interactions among genes in studies of common human diseases.
Using simulated data, we show that a genetic programming-optimized neural network models gene-gene interactions as well as a traditional back propagation neural network does. Furthermore, when non-functional polymorphisms are present, the genetic programming-optimized neural network outperforms the traditional back propagation neural network in both predictive ability and power to detect gene-gene interactions.
This study suggests that a machine learning strategy for optimizing neural network architecture may be preferable to traditional trial-and-error approaches for the identification and characterization of gene-gene interactions in common, complex human diseases.