估计扩散网络结构：恢复条件、样本复杂度与软阈值算法。

Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm.

作者信息

Daneshmand Hadi, Gomez-Rodriguez Manuel, Song Le, Schölkopf Bernhard

机构信息

MPI for Intelligent Systems.

Georgia Institute of Technology.

出版信息

JMLR Workshop Conf Proc. 2014 Jun;32(2):793-801.

PMID:25932466

原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC4412853/

Abstract

Information spreads across social and technological networks, but often the network structures are hidden from us and we only observe the traces left by the diffusion processes, called . Can we recover the hidden network structures from these observed cascades? What kind of cascades and how many cascades do we need? Are there some network structures which are more difficult than others to recover? Can we design efficient inference algorithms with provable guarantees? Despite the increasing availability of cascade-data and methods for inferring networks from these data, a thorough theoretical understanding of the above questions remains largely unexplored in the literature. In this paper, we investigate the network structure inference problem for a general family of continuous-time diffusion models using an [Formula: see text]-regularized likelihood maximization framework. We show that, as long as the cascade sampling process satisfies a natural incoherence condition, our framework can recover the correct network structure with high probability if we observe ( log ) cascades, where is the maximum number of parents of a node and is the total number of nodes. Moreover, we develop a simple and efficient soft-thresholding inference algorithm, which we use to illustrate the consequences of our theoretical results, and show that our framework outperforms other alternatives in practice.

摘要

信息在社会和技术网络中传播，但网络结构往往对我们隐藏，我们只能观察到扩散过程留下的痕迹，称为。我们能否从这些观察到的级联中恢复隐藏的网络结构？我们需要什么样的级联以及多少级联？是否存在某些网络结构比其他结构更难恢复？我们能否设计出具有可证明保证的高效推理算法？尽管级联数据的可用性不断增加，以及从这些数据推断网络的方法不断涌现，但在文献中，对上述问题的全面理论理解在很大程度上仍未得到探索。在本文中，我们使用一个[公式：见正文]正则化似然最大化框架，研究了一类通用的连续时间扩散模型的网络结构推断问题。我们表明，只要级联采样过程满足一个自然的非相干条件，那么如果我们观察到（对数）级联，我们的框架就可以以高概率恢复正确的网络结构，其中是节点的最大父节点数，是节点总数。此外，我们开发了一种简单高效的软阈值推理算法，用于说明我们理论结果的影响，并表明我们的框架在实践中优于其他方法。