Ali Mehdi, Berrendorf Max, Hoyt Charles Tapley, Vermue Laurent, Galkin Mikhail, Sharifzadeh Sahand, Fischer Asja, Tresp Volker, Lehmann Jens
IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):8825-8845. doi: 10.1109/TPAMI.2021.3124805. Epub 2022 Nov 7.
Heterogeneity in the implementation, training, and evaluation of recently published knowledge graph embedding models has made fair and thorough comparisons difficult. To assess the reproducibility of previously published results, we re-implemented and evaluated 21 models in the PyKEEN software package. In this paper, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all, and we provide insight into why this might be the case. We then performed a large-scale benchmarking study on four datasets, comprising several thousand experiments and 24,804 GPU hours of computation time. We present insights into best practices, the best configurations for each model, and where improvements could be made over previously published best configurations. Our results highlight that a model's performance is determined not only by its architecture but by the combination of model architecture, training approach, loss function, and the explicit modeling of inverse relations. We provide evidence that several architectures can obtain results competitive with the state of the art when configured carefully. We have made all code, experimental configurations, results, and analyses available at https://github.com/pykeen/pykeen and https://github.com/pykeen/benchmarking.
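The abstract names the explicit modeling of inverse relations as one of the factors that shape a model's performance. As a rough illustration of what that can mean at the data level, the sketch below augments a set of (head, relation, tail) triples with one reversed triple per original triple, using a separate inverse relation label. This is a minimal sketch under stated assumptions, not PyKEEN's implementation: the function name `add_inverse_triples`, the `_inverse` suffix, and the toy triples are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A triple is (head entity, relation, tail entity).
Triple = Tuple[str, str, str]

# Hypothetical naming convention for inverse relations (an assumption,
# not taken from the paper or from PyKEEN).
INVERSE_SUFFIX = "_inverse"


def add_inverse_triples(triples: List[Triple]) -> List[Triple]:
    """Return the original triples followed by one inverse triple each.

    For every (h, r, t), the triple (t, r + INVERSE_SUFFIX, h) is appended,
    so each relation gets a distinct inverse counterpart.
    """
    augmented = list(triples)  # copy so the input list is left unchanged
    for head, relation, tail in triples:
        augmented.append((tail, relation + INVERSE_SUFFIX, head))
    return augmented


# Toy data, invented for this example.
triples = [
    ("berlin", "capital_of", "germany"),
    ("germany", "located_in", "europe"),
]

augmented = add_inverse_triples(triples)
for triple in augmented:
    print(triple)
```

With inverse triples in place, a model can treat head prediction for (?, r, t) as tail prediction for (t, r_inverse, ?), at the cost of doubling the number of relation representations it has to learn.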