Assessing Low-Intensity Relationships in Complex Networks.

Authors

Spitz Andreas, Gimmler Anna, Stoeck Thorsten, Zweig Katharina Anna, Horvát Emőke-Ágnes

Affiliations

Institute of Computer Science, Heidelberg University, Heidelberg, BW, Germany.

Department of Ecology, University of Kaiserslautern, Kaiserslautern, RP, Germany.

Publication

PLoS One. 2016 Apr 20;11(4):e0152536. doi: 10.1371/journal.pone.0152536. eCollection 2016.

Abstract

Many large network data sets are noisy and contain links representing low-intensity relationships that are difficult to differentiate from random interactions. This is especially relevant for high-throughput data from systems biology and large-scale ecological data, but also for Web 2.0 data on human interactions. In these networks with missing and spurious links, it is possible to refine the data based on the principle of structural similarity, which assesses the shared neighborhood of two nodes. By using similarity measures to globally rank all possible links and choosing the top-ranked pairs, true links can be validated, missing links inferred, and spurious observations removed. While many similarity measures have been proposed to this end, there is no general consensus on which one to use. In this article, we first contribute a set of benchmarks for complex networks from three different settings (e-commerce, systems biology, and social networks) and thus enable a quantitative performance analysis of classic node similarity measures. Based on this, we then propose a new methodology for link assessment called z*, which assesses the statistical significance of the number of common neighbors of two nodes by comparison with the expected value in a suitably chosen random graph model, and which is a consistently top-performing algorithm for all benchmarks. In addition to a global ranking of links, we also use this method to identify the most similar neighbors of each single node in a local ranking, thereby showing the versatility of the method in two distinct scenarios and augmenting its applicability. Finally, we perform an exploratory analysis on an oceanographic plankton data set and find that the distribution of microbes follows biogeographic rules similar to those of macroorganisms, a result that rejects the global dispersal hypothesis for microbes.
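
The idea of scoring a node pair by how far its number of common neighbors lies from the expectation under a random graph model can be illustrated with a short sketch. This is not the authors' z* implementation: the abstract does not specify the null model or the normalization, so the sketch assumes a degree-preserving edge-swap null model on a simple undirected graph, and the function names (`rewire`, `z_scores`, `to_adj`) are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def to_adj(nodes, edges):
    """Build an adjacency map {node: set of neighbors} for an undirected graph."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def rewire(edges, n_swaps, rng):
    """Attempt n_swaps double edge swaps; every accepted swap keeps all degrees
    unchanged and never creates self-loops or duplicate edges."""
    edges = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:          # random orientation of the second edge
            c, d = d, c
        if len({a, b, c, d}) < 4:       # same edge or shared endpoint: skip
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def z_scores(nodes, edges, n_samples=200, seed=0):
    """z-score of the observed common-neighbor count of every node pair,
    relative to samples from the (assumed) degree-preserving null model."""
    rng = random.Random(seed)
    pairs = list(combinations(sorted(nodes), 2))
    observed_adj = to_adj(nodes, edges)
    observed = {(u, v): len(observed_adj[u] & observed_adj[v]) for u, v in pairs}
    samples = {p: [] for p in pairs}
    current = list(edges)
    for _ in range(n_samples):
        current = rewire(current, 10 * len(edges), rng)
        adj = to_adj(nodes, current)
        for u, v in pairs:
            samples[(u, v)].append(len(adj[u] & adj[v]))
    scores = {}
    for p in pairs:
        mu, sigma = mean(samples[p]), pstdev(samples[p])
        # Pairs whose count never varies under the null model get a score of 0.
        scores[p] = (observed[p] - mu) / sigma if sigma > 0 else 0.0
    return scores


if __name__ == "__main__":
    nodes = list(range(6))
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
    ranking = sorted(z_scores(nodes, edges).items(), key=lambda kv: -kv[1])
    for pair, z in ranking[:5]:
        print(pair, round(z, 2))
```

Sorting all pairs by score gives a global ranking of the kind the abstract describes; restricting the sort to the pairs that contain one fixed node gives the corresponding local ranking for that node.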


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c69/4838277/9640272058ab/pone.0152536.g001.jpg
