Stacking models for nearly optimal link prediction in complex networks.

Affiliations

Department of Computer Science, University of Colorado, Boulder, CO 80309;

Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292.

Publication information

Proc Natl Acad Sci U S A. 2020 Sep 22;117(38):23393-23400. doi: 10.1073/pnas.1914950117. Epub 2020 Sep 4.

Abstract

Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speed up network data collection and improve network model validation. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 550 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity using network-based metalearning to construct a series of "stacked" models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state of the art for link prediction comes from combining individual algorithms, which can achieve nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvements.
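To make the idea of stacking concrete, here is a minimal sketch, not the authors' implementation. It computes three standard topological scores for each node pair (common neighbors, Jaccard coefficient, and log preferential attachment) and feeds them to a meta-learner that outputs one combined score. The abstract does not say which meta-learner or which features the paper uses, so the logistic regression, the three features, the planted two-community graph, and every function name below (`pair_features`, `fit_meta_learner`, `run_demo`) are assumptions made for illustration.

```python
import math
import random
from itertools import combinations

def pair_features(adj, u, v):
    """Three classic topological predictors for the node pair (u, v)."""
    nu, nv = adj[u], adj[v]
    common = len(nu & nv)
    union = len(nu | nv)
    jaccard = common / union if union else 0.0
    return [float(common), jaccard, math.log1p(len(nu) * len(nv))]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_learner(X, y, epochs=500, lr=0.05):
    """Assumed stand-in meta-learner: logistic regression via batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias term
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            row = xi + [1.0]
            err = sigmoid(sum(a * b for a, b in zip(w, row))) - yi
            for j, xj in enumerate(row):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    return sigmoid(sum(a * b for a, b in zip(w, x + [1.0])))

def auc(scores, labels):
    """Probability that a random true link outscores a random non-link."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        return 0.5
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def run_demo(seed=0):
    rng = random.Random(seed)
    n = 40
    # Hypothetical test graph: two planted communities of 20 nodes each.
    edges = [(u, v) for u, v in combinations(range(n), 2)
             if rng.random() < (0.4 if (u < n // 2) == (v < n // 2) else 0.03)]
    rng.shuffle(edges)
    cut = len(edges) // 5
    hidden, observed = edges[:cut], edges[cut:]  # hide 20% of the links
    adj = {i: set() for i in range(n)}
    for u, v in observed:
        adj[u].add(v)
        adj[v].add(u)
    edge_set = set(edges)
    non_edges = [p for p in combinations(range(n), 2) if p not in edge_set]
    negatives = rng.sample(non_edges, len(hidden))
    pairs = [(p, 1) for p in hidden] + [(p, 0) for p in negatives]
    rng.shuffle(pairs)
    # Features are computed on the observed graph only.
    X = [pair_features(adj, u, v) for (u, v), _ in pairs]
    y = [label for _, label in pairs]
    half = len(pairs) // 2
    w = fit_meta_learner(X[:half], y[:half])       # train on one half
    scores = [predict(w, x) for x in X[half:]]     # score the other half
    return w, auc(scores, y[half:])

if __name__ == "__main__":
    weights, score = run_demo()
    print("meta-learner weights:", weights)
    print("held-out AUC:", score)
```

The sketch simplifies the evaluation: it labels hidden links and sampled non-links directly and splits them into training and test halves, which is easier to follow than a full protocol that removes links a second time to build the training set. Swapping in more predictors only requires extending `pair_features`.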
