Shimamura Teppei, Imoto Seiya, Yamaguchi Rui, Miyano Satoru
Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan.
Genome Inform. 2007;19:142-53.
We propose a statistical method based on graphical Gaussian models for estimating large gene networks from DNA microarray data. When estimating large gene networks, the number of genes exceeds the number of samples, so some restrictions must be imposed on model building. We propose weighted lasso estimation for graphical Gaussian models as a model of large gene networks. In the proposed method, structural learning of the gene network is equivalent to selecting the regularization parameters in the weighted lasso estimation. We investigate this problem from a Bayesian perspective and derive an empirical Bayesian information criterion for choosing these parameters. Unlike Bayesian network approaches, our method can find the optimal network structure and does not require a heuristic structural learning algorithm. We conduct Monte Carlo simulations to show the effectiveness of the proposed method. We also analyze Arabidopsis thaliana microarray data and estimate gene networks.
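To illustrate how a weighted lasso penalty can yield a sparse network when edges are read off from nonzero coefficients, the following is a minimal sketch and not the authors' implementation. It assumes a node-wise (neighborhood) regression formulation solved by coordinate descent, penalty weights set to the inverse absolute marginal correlation, and a hand-picked regularization parameter `lam`; the abstract specifies none of these, and the empirical Bayesian information criterion for choosing the regularization parameters is not reproduced. All function names are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def standardize(X):
    """Return a copy of X (list of rows) with zero-mean, unit-variance columns."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [X[i][j] for i in range(n)]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / sd for v in col])
    return [[cols[j][i] for j in range(p)] for i in range(n)]


def estimate_edges(X, lam, eps=1e-3):
    """Regress each gene on all others; keep an edge if either coefficient is nonzero."""
    Z = standardize(X)
    n, p = len(Z), len(Z[0])
    edges = set()
    for g in range(p):
        others = [h for h in range(p) if h != g]
        y = [row[g] for row in Z]
        D = [[row[h] for h in others] for row in Z]
        # Assumed weighting scheme: weakly correlated pairs are penalised more.
        weights = [1.0 / (abs(sum(row[g] * row[h] for row in Z) / n) + eps)
                   for h in others]
        beta = weighted_lasso(D, y, weights, lam)
        for h, b in zip(others, beta):
            if b != 0.0:
                edges.add((min(g, h), max(g, h)))
    return edges


if __name__ == "__main__":
    # Synthetic chain g0 -> g1 -> g2 plus an unrelated gene g3 (illustrative data only).
    random.seed(0)
    rows = []
    for _ in range(60):
        g0 = random.gauss(0, 1)
        g1 = 0.8 * g0 + random.gauss(0, 0.5)
        g2 = 0.8 * g1 + random.gauss(0, 0.5)
        g3 = random.gauss(0, 1)
        rows.append([g0, g1, g2, g3])
    print(sorted(estimate_edges(rows, lam=0.2)))
```

In this sketch the network structure is determined entirely by `lam` and the weights, which mirrors the abstract's point that structural learning reduces to selecting the regularization parameters; a larger `lam` removes edges and a sufficiently large one yields an empty graph.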