Yin Jianxin, Li Hongzhe
School of Statistics, Renmin University of China, No. 59 Zhongguancun Street, Haidian District, Beijing 100872, China and Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA 19104-6021, USA.
J Multivar Anal. 2012 May 1;107:119-140. doi: 10.1016/j.jmva.2012.01.005.
Motivated by the analysis of gene expression data measured over different tissues or over time, we consider matrix-valued random variables and the matrix-normal distribution, where the two precision matrices have graphical interpretations for genes and tissues, respectively. We present an l1-penalized likelihood method and an efficient coordinate descent-based computational algorithm for model selection and estimation in such matrix normal graphical models (MNGMs). We provide theoretical results on the asymptotic distributions, the rates of convergence of the estimates, and the sparsistency, allowing both the number of genes and the number of tissues to diverge as the sample size goes to infinity. Simulation results demonstrate that MNGMs can lead to better estimates of the precision matrices and better identification of the graph structures than standard Gaussian graphical models. We illustrate the methods with an analysis of mouse gene expression data measured over ten different tissues.
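To make the penalized likelihood in the abstract concrete, the following is a minimal sketch of an l1-penalized negative log-likelihood for zero-mean matrix-normal data. It is an illustration under stated assumptions, not the authors' implementation: the function name `penalized_neg_loglik`, the argument names, the zero-mean assumption, and the choice to penalize only off-diagonal entries with separate tuning parameters `lam` and `rho` are all hypothetical and are not taken from the abstract. It uses the standard matrix-normal density, in which a p x q observation has a p x p row precision matrix A (genes) and a q x q column precision matrix B (tissues).

```python
import numpy as np


def penalized_neg_loglik(X, A, B, lam=0.0, rho=0.0):
    """Hypothetical helper: l1-penalized negative log-likelihood.

    X   : array of shape (n, p, q), n zero-mean matrix observations (assumed)
    A   : (p, p) row precision matrix (e.g. genes)
    B   : (q, q) column precision matrix (e.g. tissues)
    lam : assumed penalty weight on off-diagonal entries of A
    rho : assumed penalty weight on off-diagonal entries of B
    """
    n, p, q = X.shape
    sign_a, logdet_a = np.linalg.slogdet(A)
    sign_b, logdet_b = np.linalg.slogdet(B)
    if sign_a <= 0 or sign_b <= 0:
        raise ValueError("A and B must have positive determinants")

    # Sum over observations of tr(A X_k B X_k^T), the quadratic form.
    quad = sum(np.trace(A @ Xk @ B @ Xk.T) for Xk in X)

    # Matrix-normal log-likelihood, including the normalizing constant.
    loglik = (
        0.5 * n * q * logdet_a
        + 0.5 * n * p * logdet_b
        - 0.5 * quad
        - 0.5 * n * p * q * np.log(2.0 * np.pi)
    )

    # l1 penalty on off-diagonal entries only (an assumption of this sketch).
    off_a = np.abs(A).sum() - np.abs(np.diag(A)).sum()
    off_b = np.abs(B).sum() - np.abs(np.diag(B)).sum()
    return -loglik + lam * off_a + rho * off_b


# Usage on synthetic data with identity precision matrices.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((5, 3, 2))
value = penalized_neg_loglik(X_demo, np.eye(3), np.eye(2), lam=0.1, rho=0.1)
```

Note that the unpenalized likelihood is unchanged when A is multiplied by a positive constant and B is divided by the same constant, so A and B are identifiable only up to scale. Any fitting procedure built on an objective like this one needs a normalization convention, which this sketch does not impose.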