On the inconsistency of ℓ₁-penalised sparse precision matrix estimation.

Authors

Heinävaara Otte, Leppä-Aho Janne, Corander Jukka, Honkela Antti

Affiliations

Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland.

Helsinki Institute for Information Technology HIIT, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.

Publication information

BMC Bioinformatics. 2016 Dec 13;17(Suppl 16):448. doi: 10.1186/s12859-016-1309-x.

Abstract

BACKGROUND

Various ℓ₁-penalised estimation methods, such as graphical lasso and CLIME, are widely used for sparse precision matrix estimation and for learning undirected network structure from data. Many of these methods have been shown to be consistent under various quantitative assumptions about the underlying true covariance matrix. Intuitively, these conditions are related to situations where the penalty term will dominate the optimisation.
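For orientation, the two estimators named above are usually written as follows. The notation is an assumption introduced here and does not appear in the abstract: S denotes the sample covariance matrix, Θ the candidate precision matrix, and λ > 0 the penalty weight. Some formulations penalise only the off-diagonal entries of Θ.

```latex
% Graphical lasso: l1-penalised Gaussian log-likelihood
\hat{\Theta}_{\mathrm{glasso}}
  = \arg\min_{\Theta \succ 0}
    \; \operatorname{tr}(S\Theta) - \log\det\Theta + \lambda \lVert \Theta \rVert_{1}

% CLIME: constrained l1 minimisation
\hat{\Theta}_{\mathrm{CLIME}}
  = \arg\min_{\Theta} \; \lVert \Theta \rVert_{1}
    \quad \text{subject to} \quad
    \lVert S\Theta - I \rVert_{\infty} \le \lambda
```

In both cases the ℓ₁ term rewards precision matrices with small absolute entries, which is why the size of the true precision matrix matters for whether the penalty dominates.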

RESULTS

We explore the consistency of ℓ₁-based methods for a class of bipartite graphs motivated by the structure of models commonly used for gene regulatory networks. We show that all ℓ₁-based methods fail dramatically for models with nearly linear dependencies between the variables. We also study the consistency on models derived from real gene expression data and note that the assumptions needed for consistency never hold even for modest-sized gene networks, and that ℓ₁-based methods also become unreliable in practice for larger networks.
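The effect of nearly linear dependencies can be illustrated with a small toy model. This is a sketch under stated assumptions, not the construction used in the paper: one hub variable h ~ N(0, 1) and children c_i = h + eps_i with noise variance `noise_var`, which gives a star-shaped (hence bipartite) graph. The function names `star_precision` and `summarize` are hypothetical and introduced only for this example.

```python
import numpy as np


def star_precision(n_children: int, noise_var: float) -> np.ndarray:
    """True precision matrix of (h, c_1, ..., c_d) for the assumed toy model
    h ~ N(0, 1), c_i = h + eps_i, eps_i ~ N(0, noise_var)."""
    d = n_children
    theta = np.zeros((d + 1, d + 1))
    theta[0, 0] = 1.0 + d / noise_var        # hub diagonal entry
    theta[0, 1:] = -1.0 / noise_var          # hub-child edges
    theta[1:, 0] = -1.0 / noise_var
    theta[1:, 1:] = np.eye(d) / noise_var    # children: no child-child edges
    return theta


def summarize(n_children: int, noise_var: float) -> dict:
    """Compare the sparse precision matrix with the dense covariance it implies."""
    theta = star_precision(n_children, noise_var)
    sigma = np.linalg.inv(theta)             # implied covariance matrix
    sd = np.sqrt(np.diag(sigma))
    corr = sigma / np.outer(sd, sd)
    return {
        # number of edges in the true graph (nonzero upper-triangle entries)
        "true_edges": int(np.count_nonzero(np.triu(theta, k=1))),
        # marginal correlation between two children that share NO edge
        "child_child_corr": float(corr[1, 2]),
        # entrywise l1 norm of the true precision matrix
        "l1_norm": float(np.abs(theta).sum()),
    }


if __name__ == "__main__":
    for noise_var in (1.0, 0.1, 0.01):
        print(noise_var, summarize(3, noise_var))
```

In this toy model the marginal correlation between two unconnected children is 1 / (1 + noise_var), so it approaches 1 as the noise shrinks, while the ℓ₁ norm of the true precision matrix grows like 1 / noise_var. A penalty on that norm therefore pushes against the true sparse structure exactly in the nearly linear regime the abstract describes.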

CONCLUSIONS

Our results demonstrate that ℓ₁-penalised undirected network structure learning methods are unable to reliably learn many sparse bipartite graph structures, which arise often in gene expression data. Users of such methods should be aware of the consistency criteria of the methods and check whether they are likely to be met in their application of interest.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7964/5249033/a81e6dbf72ee/12859_2016_1309_Fig1_HTML.jpg
