Okut Hayrettin, Gianola Daniel, Rosa Guilherme J M, Weigel Kent A
Department of Animal Sciences, University of Yuzuncu Yil, Van, 65080, Turkey.
Department of Dairy Science, University of Wisconsin, Madison, WI 53706, USA.
Genet Res (Camb). 2011 Jun;93(3):189-201. doi: 10.1017/S0016672310000662. Epub 2011 Apr 12.
Bayesian regularized artificial neural networks (BRANNs) were used to predict body mass index (BMI) in mice using single nucleotide polymorphism (SNP) markers. Data from 1896 animals with both phenotypic and genotypic (12 320 loci) information were used for the analysis. Missing genotypes were imputed based on estimated allelic frequencies, with no attempt to reconstruct haplotypes based on family information or linkage disequilibrium between markers. A feed-forward multilayer perceptron network consisting of a single output layer and one hidden layer was used. The network was trained with the Bayesian regularized backpropagation algorithm. As the number of neurons in the hidden layer was increased, the effective number of parameters, γ, increased up to a point and stabilized thereafter. A model with five neurons in the hidden layer produced a value of γ that saturated the data. In terms of predictive ability, a network with five neurons in the hidden layer attained the smallest error and highest correlation in the test data, although differences among networks were negligible. Using the inherent weight information of BRANNs with different numbers of neurons in the hidden layer, it was observed that 17 SNPs had a larger impact on the network, indicating their possible relevance in the prediction of BMI. It is concluded that BRANN may be at least as useful as other methods for high-dimensional genome-enabled prediction, with the advantage of its potential ability to capture non-linear relationships, which may be useful in the study of quantitative traits under complex gene action.
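The abstract does not state how γ was computed. As a minimal sketch, assuming the formulation commonly used for Bayesian regularized backpropagation (objective F = βE_D + αE_W, Gauss-Newton Hessian H ≈ 2βJᵀJ + 2αI, and γ = N − 2α·tr(H⁻¹), where N is the total number of weights), the effective number of parameters could be evaluated as below. The function name, the random Jacobian `J`, and the values of `alpha` and `beta` are hypothetical and serve only as illustration; they are not taken from the paper.

```python
import numpy as np

def effective_parameters(J, alpha, beta):
    """Effective number of parameters, gamma = N - 2*alpha*trace(H^-1).

    Assumption (not from the abstract): H is the Gauss-Newton
    approximation 2*beta*J'J + 2*alpha*I, where J is the Jacobian of the
    network errors with respect to the N weights.
    """
    n_params = J.shape[1]
    H = 2.0 * beta * (J.T @ J) + 2.0 * alpha * np.eye(n_params)
    return n_params - 2.0 * alpha * np.trace(np.linalg.inv(H))

# Hypothetical Jacobian: 50 records, 8 weights (illustration only).
rng = np.random.default_rng(0)
J = rng.normal(size=(50, 8))

# Weak regularization: nearly all 8 weights are determined by the data.
gamma_weak = effective_parameters(J, alpha=1e-6, beta=1.0)
# Strong regularization: the prior dominates and gamma shrinks toward 0.
gamma_strong = effective_parameters(J, alpha=1e6, beta=1.0)
```

Under these assumptions γ lies between 0 and N, which is consistent with the plateau described above: once adding hidden neurons no longer raises γ, the extra weights are being shrunk by the prior rather than determined by the data.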