On the inconsistency of ℓ₁-penalised sparse precision matrix estimation.

Authors

Heinävaara Otte, Leppä-Aho Janne, Corander Jukka, Honkela Antti

Affiliations

Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland.

Helsinki Institute for Information Technology HIIT, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.

Publication information

BMC Bioinformatics. 2016 Dec 13;17(Suppl 16):448. doi: 10.1186/s12859-016-1309-x.

DOI:10.1186/s12859-016-1309-x

PMID:28105909

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5249033/

Abstract

BACKGROUND

Various ℓ₁-penalised estimation methods such as graphical lasso and CLIME are widely used for sparse precision matrix estimation and learning of undirected network structure from data. Many of these methods have been shown to be consistent under various quantitative assumptions about the underlying true covariance matrix. Intuitively, these conditions are related to situations where the penalty term will dominate the optimisation.
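For orientation, the two estimators named above are usually written as the following optimisation problems. The notation is an assumption added here and does not come from the abstract: S is the sample covariance matrix, λ > 0 the regularisation parameter, ‖·‖₁ the elementwise sum of absolute values and ‖·‖∞ the elementwise maximum. Some formulations of graphical lasso leave the diagonal unpenalised, and CLIME symmetrises its solution afterwards.

```latex
% Graphical lasso: l1-penalised Gaussian log-likelihood
\hat{\Theta}_{\mathrm{glasso}}
  = \operatorname*{arg\,min}_{\Theta \succ 0}
    \; \operatorname{tr}(S\Theta) - \log\det\Theta + \lambda \lVert \Theta \rVert_1

% CLIME: constrained l1 minimisation
\hat{\Omega}_{\mathrm{CLIME}}
  = \operatorname*{arg\,min}_{\Omega}
    \; \lVert \Omega \rVert_1
    \quad \text{subject to} \quad
    \lVert S\Omega - I \rVert_\infty \le \lambda
```

In both problems the ℓ₁ term is what produces sparsity, so a true precision matrix with a large ℓ₁ norm is expensive under the penalty.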

RESULTS

We explore the consistency of ℓ₁-based methods for a class of bipartite graphs motivated by the structure of models commonly used for gene regulatory networks. We show that all ℓ₁-based methods fail dramatically for models with nearly linear dependencies between the variables. We also study the consistency on models derived from real gene expression data and note that the assumptions needed for consistency never hold even for modest-sized gene networks, and that ℓ₁-based methods also become unreliable in practice for larger networks.
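A minimal sketch of why near-linear dependencies are awkward for an ℓ₁ penalty, using a toy model that is an assumption of this example and not the construction used in the paper: one regulator x ~ N(0, 1) and q targets y_j = a_j·x + noise with noise variance sigma2. The true precision matrix is a sparse star (a bipartite graph with x on one side), but its ℓ₁ norm grows like 1/sigma2 as the dependence becomes nearly linear. The function names `star_model`, `matmul`, `l1_norm` and `offdiag_nonzeros` are hypothetical helpers.

```python
def star_model(a, sigma2):
    """Return (covariance, precision) of z = (x, y_1..y_q) for the toy model
    x ~ N(0, 1), y_j = a_j * x + eps_j, eps_j ~ N(0, sigma2) independent."""
    n = len(a) + 1
    cov = [[0.0] * n for _ in range(n)]
    prec = [[0.0] * n for _ in range(n)]
    cov[0][0] = 1.0
    prec[0][0] = 1.0 + sum(aj * aj for aj in a) / sigma2
    for j, aj in enumerate(a, start=1):
        cov[0][j] = cov[j][0] = aj
        prec[0][j] = prec[j][0] = -aj / sigma2   # edge x -- y_j
        prec[j][j] = 1.0 / sigma2                # no y_j -- y_k edges
        for k, ak in enumerate(a, start=1):
            cov[j][k] = aj * ak + (sigma2 if j == k else 0.0)
    return cov, prec


def matmul(m1, m2):
    """Plain matrix product of two square lists of lists."""
    n = len(m1)
    return [[sum(m1[i][k] * m2[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def l1_norm(m):
    """Elementwise l1 norm: sum of absolute values of all entries."""
    return sum(abs(v) for row in m for v in row)


def offdiag_nonzeros(m):
    """Number of nonzero off-diagonal entries."""
    return sum(1 for i, row in enumerate(m) for j, v in enumerate(row)
               if i != j and v != 0.0)


if __name__ == "__main__":
    weights = [1.0, -0.5, 2.0]  # illustrative regulatory weights (assumed)
    for sigma2 in (1.0, 0.1, 0.01):
        cov, prec = star_model(weights, sigma2)
        print(sigma2, offdiag_nonzeros(prec), l1_norm(prec))
```

The sparsity pattern of the precision matrix stays fixed while its ℓ₁ norm increases as sigma2 shrinks, so for a fixed penalty weight the true structure becomes increasingly costly. This only illustrates the intuition; it does not reproduce the paper's consistency analysis.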

CONCLUSIONS

Our results demonstrate that ℓ₁-penalised undirected network structure learning methods are unable to reliably learn many sparse bipartite graph structures, which arise often in gene expression data. Users of such methods should be aware of the consistency criteria of the methods and check if they are likely to be met in their application of interest.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7964/5249033/a81e6dbf72ee/12859_2016_1309_Fig1_HTML.jpg

Similar articles

Regularized estimation of large-scale gene association networks using graphical Gaussian models.

BMC Bioinformatics. 2009 Nov 24;10:384. doi: 10.1186/1471-2105-10-384.

Sparse Inverse Covariance Estimation with L0 Penalty for Network Construction with Omics Data.

J Comput Biol. 2016 Mar;23(3):192-202. doi: 10.1089/cmb.2015.0102. Epub 2016 Feb 1.

Weighted lasso in graphical Gaussian modeling for large gene network estimation based on microarray data.

Genome Inform. 2007;19:142-53.

Statistical completion of a partially identified graph with applications for the estimation of gene regulatory networks.

Biostatistics. 2015 Oct;16(4):670-85. doi: 10.1093/biostatistics/kxv013. Epub 2015 Apr 1.

A linear programming approach for estimating the structure of a sparse linear genetic network from transcript profiling data.

Algorithms Mol Biol. 2009 Feb 24;4:5. doi: 10.1186/1748-7188-4-5.

A boosting approach to structure learning of graphs with and without prior knowledge.

Bioinformatics. 2009 Nov 15;25(22):2929-36. doi: 10.1093/bioinformatics/btp485. Epub 2009 Aug 20.

Learning gene regulatory networks using gaussian process emulator and graphical LASSO.

J Bioinform Comput Biol. 2021 Jun;19(3):2150007. doi: 10.1142/S0219720021500074. Epub 2021 Apr 30.

Learning directed acyclic graphical structures with genetical genomics data.

Bioinformatics. 2015 Dec 15;31(24):3953-60. doi: 10.1093/bioinformatics/btv513. Epub 2015 Sep 2.

Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs.

Biometrika. 2010 Sep;97(3):519-538. doi: 10.1093/biomet/asq038. Epub 2010 Jul 9.

Cited by

Back to the basics: Rethinking partial correlation network methodology.

Br J Math Stat Psychol. 2020 May;73(2):187-212. doi: 10.1111/bmsp.12173. Epub 2019 Jun 17.

A Combined PLS and Negative Binomial Regression Model for Inferring Association Networks from Next-Generation Sequencing Count Data.

IEEE/ACM Trans Comput Biol Bioinform. 2018 May-Jun;15(3):760-773. doi: 10.1109/TCBB.2017.2665495. Epub 2017 Feb 7.

Selected proceedings of Machine Learning in Systems Biology: MLSB 2016.

BMC Bioinformatics. 2016 Dec 13;17(Suppl 16):437. doi: 10.1186/s12859-016-1305-1.

References

An experimentally supported model of the Bacillus subtilis global transcriptional regulatory network.

Mol Syst Biol. 2015 Nov 17;11(11):839. doi: 10.15252/msb.20156236.

Fast and Adaptive Sparse Precision Matrix Estimation in High Dimensions.

J Multivar Anal. 2015 Mar 1;135:153-162. doi: 10.1016/j.jmva.2014.11.005.

Comprehensive molecular portraits of human breast tumours.

Nature. 2012 Oct 4;490(7418):61-70. doi: 10.1038/nature11412. Epub 2012 Sep 23.

Gene regulatory networks from multifactorial perturbations using Graphical Lasso: application to the DREAM4 challenge.

PLoS One. 2010 Dec 20;5(12):e14147. doi: 10.1371/journal.pone.0014147.

Partial Correlation Estimation by Joint Sparse Regression Models.

J Am Stat Assoc. 2009 Jun 1;104(486):735-746. doi: 10.1198/jasa.2009.0126.

Sparse inverse covariance estimation with the graphical lasso.

Biostatistics. 2008 Jul;9(3):432-41. doi: 10.1093/biostatistics/kxm045. Epub 2007 Dec 12.

An Arabidopsis gene network based on the graphical Gaussian model.

Genome Res. 2007 Nov;17(11):1614-25. doi: 10.1101/gr.6911207. Epub 2007 Oct 5.

Probabilistic inference of transcription factor concentrations and gene-specific regulatory activities.

Bioinformatics. 2006 Nov 15;22(22):2775-81. doi: 10.1093/bioinformatics/btl473. Epub 2006 Sep 11.

Bayesian sparse hidden components analysis for transcription regulation networks.

Bioinformatics. 2006 Mar 15;22(6):739-46. doi: 10.1093/bioinformatics/btk017. Epub 2005 Dec 20.

Network component analysis: reconstruction of regulatory signals in biological systems.

Proc Natl Acad Sci U S A. 2003 Dec 23;100(26):15522-7. doi: 10.1073/pnas.2136632100. Epub 2003 Dec 12.
