Liu Zhenqiu, Lin Shili, Deng Nan, McGovern Dermot P B, Piantadosi Steven
1 Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California.
2 Department of Statistics, The Ohio State University, Columbus, Ohio.
J Comput Biol. 2016 Mar;23(3):192-202. doi: 10.1089/cmb.2015.0102. Epub 2016 Feb 1.
Constructing coexpression and association networks with omics data is crucial for studying gene-gene interactions and the underlying biological mechanisms. In recent years, learning the structure of a Gaussian graphical model from high-dimensional data using an L1 penalty has been well studied, and many applications have been found in bioinformatics and computational biology. However, in addition to the biased estimators produced by LASSO, the L1 penalty does not always select the true model consistently. Based on our previous work with L0 regularized regression (Liu and Li, 2014), we propose an L0 regularized sparse inverse covariance estimation (L0RICE) for structure learning with the efficient alternating direction (AD) method. The proposed method is robust and has the oracle property. It is applied to omics data, including data from next-generation sequencing technologies. Novel procedures for network construction and high-order gene-gene interaction detection with omics data are developed. Results from simulation and real omics data analysis indicate that L0 regularized structure learning can identify high-order correlation structure with a lower false positive rate and outperforms graphical lasso by a large margin.
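
For orientation, the contrast between the two penalties can be written as follows. This is a sketch in standard Gaussian graphical model notation, not the paper's own formulation: the symbols S (sample covariance matrix), Θ (precision, or inverse covariance, matrix), and λ (tuning parameter) are assumptions introduced here, and whether the diagonal entries are penalized differs between formulations.

```latex
% Assumed notation: S = sample covariance, \Theta = (\theta_{ij}) = precision matrix,
% \lambda > 0 = tuning parameter. Only off-diagonal entries are penalized in this sketch.
\begin{align}
  % L1-penalized estimate (graphical lasso form)
  \hat{\Theta}_{L_1} &= \arg\min_{\Theta \succ 0}\;
    -\log\det\Theta + \operatorname{tr}(S\Theta)
    + \lambda \sum_{i \neq j} \lvert \theta_{ij} \rvert, \\
  % L0-penalized estimate: the penalty counts nonzero off-diagonal entries
  \hat{\Theta}_{L_0} &= \arg\min_{\Theta \succ 0}\;
    -\log\det\Theta + \operatorname{tr}(S\Theta)
    + \lambda \sum_{i \neq j} \mathbf{1}(\theta_{ij} \neq 0), \\
  % Partial correlation between variables i and j given all others;
  % an edge i--j is present in the network exactly when \theta_{ij} \neq 0
  \rho_{ij \mid \text{rest}} &= -\frac{\theta_{ij}}{\sqrt{\theta_{ii}\,\theta_{jj}}}.
\end{align}
```

The L1 penalty shrinks every nonzero entry toward zero, which is the source of the bias mentioned in the abstract, whereas the L0 penalty charges a fixed cost per edge regardless of its magnitude. The L0 problem is nonconvex, which is consistent with the abstract's use of an iterative alternating direction scheme rather than a direct convex solver.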