Li Bing, Chun Hyonho, Zhao Hongyu
Professor of Statistics, The Pennsylvania State University, 326 Thomas Building, University Park, PA 16802.
Assistant Professor of Statistics, Purdue University, 250 N. University Street, West Lafayette, IN 47907.
J Am Stat Assoc. 2012 Jan 1;107(497):152-167. doi: 10.1080/01621459.2011.644498.
In many applications the graph structure in a network arises from two sources: intrinsic connections and connections due to external effects. We introduce a sparse estimation procedure for graphical models that isolates the intrinsic connections by removing the external effects. Technically, this is formulated as a conditional graphical model, in which the external effects are modeled as predictors and the graph is determined by the conditional precision matrix. We introduce two sparse estimators of this matrix, using a reproducing kernel Hilbert space combined with the lasso and the adaptive lasso. We establish the sparsity, variable selection consistency, oracle property, and asymptotic distributions of the proposed estimators. We also derive their convergence rates when the dimension of the conditional precision matrix goes to infinity. The methods are compared with sparse estimators for unconditional graphical models, and with the constrained maximum likelihood estimate that assumes a known graph structure. The methods are applied to a genetic data set to construct a gene network conditioning on single-nucleotide polymorphisms.
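The formulation described in the abstract can be written out roughly as follows. This is a minimal sketch of a standard Gaussian conditional graphical model with lasso-type penalties, not necessarily the paper's exact notation or objective. The symbols $Y$, $X$, $\Sigma$, $\Theta$, $\hat\Sigma$, $\lambda$, $\gamma$, and $\tilde\theta_{ij}$ are assumptions introduced here for illustration, as is the assumption that the conditional covariance does not depend on $X$.

```latex
% Y: p-dimensional vector of nodes (e.g. gene expressions).
% X: external effects treated as predictors (e.g. SNPs).
% Assumed model: conditionally Gaussian, covariance not depending on X.
\[
  Y \mid X \sim N\bigl(\mathrm{E}(Y \mid X),\, \Sigma\bigr),
  \qquad
  \Theta = \Sigma^{-1} = (\theta_{ij})_{i,j=1}^{p}.
\]

% Graph rule: nodes i and j are joined by an edge iff theta_ij is nonzero.
\[
  \theta_{ij} = 0
  \iff
  Y_i \perp Y_j \mid \{Y_k : k \neq i, j\},\, X.
\]

% Lasso-penalized estimate, given an estimate \hat\Sigma of the
% conditional covariance (hypothetical tuning parameter \lambda > 0).
\[
  \hat\Theta_{\mathrm{lasso}}
  = \arg\min_{\Theta \succ 0}
    \Bigl\{ -\log\det\Theta + \operatorname{tr}(\Theta\hat\Sigma)
            + \lambda \sum_{i \neq j} |\theta_{ij}| \Bigr\}.
\]

% Adaptive lasso: the same objective with data-driven weights built from
% an initial estimate \tilde\Theta = (\tilde\theta_{ij}) and \gamma > 0.
\[
  \hat\Theta_{\mathrm{alasso}}
  = \arg\min_{\Theta \succ 0}
    \Bigl\{ -\log\det\Theta + \operatorname{tr}(\Theta\hat\Sigma)
            + \lambda \sum_{i \neq j}
              |\tilde\theta_{ij}|^{-\gamma}\, |\theta_{ij}| \Bigr\}.
\]
```

Here $\hat\Sigma$ stands for an estimate of the conditional covariance of $Y$ given $X$. The abstract says this step uses a reproducing kernel Hilbert space but does not spell out the construction, so it is left abstract in the sketch. The difference from an unconditional graphical model is only in what is plugged in: the unconditional version would use the marginal sample covariance of $Y$, so edges induced by $X$ would remain in the estimated graph.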