Tatarinova Tatiana, Bouck John, Schumitzky Alan
Department of Mathematics, Loyola Marymount University, Los Angeles, CA 90045, USA.
J Bioinform Comput Biol. 2008 Aug;6(4):727-46. doi: 10.1142/s0219720008003710.
In this paper, we study Bayesian analysis of nonlinear hierarchical mixture models with a finite but unknown number of components. Our approach is based on Markov chain Monte Carlo (MCMC) methods. One application of our method is the clustering problem in gene expression analysis. From a mathematical and statistical point of view, we discuss the following topics: theoretical and practical convergence problems of the MCMC method; determination of the number of components in the mixture; and computational problems associated with likelihood calculations. In the existing literature, these problems have mainly been addressed in the linear case. One of the main contributions of this paper is the development of a method for the nonlinear case. Our approach combines several methods, including Gibbs sampling, random permutation sampling, birth-death MCMC, and the Kullback-Leibler distance.
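As a rough illustration of two of the ingredients named in the abstract, the sketch below shows a random permutation step for mixture component labels and the Kullback-Leibler distance between two densities. It is not the authors' implementation: the function names `permute_labels` and `kl_normal`, the flat list representation of the sampler state, and the choice of univariate normal densities for the KL example are all assumptions made for illustration only, and the paper's nonlinear hierarchical model is not reproduced here.

```python
import math
import random


def permute_labels(weights, params, alloc, rng):
    """Apply one random relabeling to a mixture MCMC state.

    weights, params: per-component lists of equal length K (hypothetical layout).
    alloc: list of component indices, one per observation.
    The mixture itself is unchanged; only the labels are shuffled.
    """
    k = len(weights)
    perm = list(range(k))
    rng.shuffle(perm)  # perm[new] = old label that moves to position `new`
    inverse = [0] * k
    for new, old in enumerate(perm):
        inverse[old] = new  # old label -> new label
    new_weights = [weights[old] for old in perm]
    new_params = [params[old] for old in perm]
    new_alloc = [inverse[z] for z in alloc]
    return new_weights, new_params, new_alloc


def kl_normal(m1, s1, m2, s2):
    """KL distance from N(m1, s1^2) to N(m2, s2^2) (closed form, illustrative case)."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    state = ([0.5, 0.3, 0.2], [-1.0, 0.0, 2.0], [0, 2, 1, 1, 0])
    print(permute_labels(*state, rng))
    print(kl_normal(0.0, 1.0, 1.0, 1.0))
```

In a sampler, a step like `permute_labels` would typically be applied after each Gibbs sweep so that the chain visits all equivalent labelings of the components, while a small KL distance between two fitted components is one possible signal that they are nearly redundant.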