Lovelace Justin, Newman-Griffis Denis, Vashishth Shikhar, Lehman Jill Fain, Rosé Carolyn Penstein
Language Technologies Institute, Carnegie Mellon University, USA.
Department of Biomedical Informatics, University of Pittsburgh, USA.
Proc Conf Assoc Comput Linguist Meet. 2021 Aug;2021:1016-1029. doi: 10.18653/v1/2021.acl-long.82.
Knowledge Graph (KG) completion research usually focuses on densely connected benchmark datasets that are not representative of real KGs. We curate two KG datasets that include biomedical and encyclopedic knowledge and use an existing commonsense KG dataset to explore KG completion in the more realistic setting where dense connectivity is not guaranteed. We develop a deep convolutional network that utilizes textual entity representations and demonstrate that our model outperforms recent KG completion methods in this challenging setting. We find that our model's performance improvements stem primarily from its robustness to sparsity. We then distill the knowledge from the convolutional network into a student network that re-ranks promising candidate entities. This re-ranking stage leads to further improvements in performance and demonstrates the effectiveness of entity re-ranking for KG completion.
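The distill-then-re-rank idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss, the number of re-ranked candidates, or how teacher and student scores are combined. The function names, the soft-label cross-entropy loss, the cutoff `k`, and the mixing weight `alpha` are all assumptions made for illustration.

```python
import math


def log_softmax(scores, temperature=1.0):
    """Numerically stable log-softmax over a list of raw scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]


def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """Cross-entropy of the student's distribution against the teacher's
    soft distribution (an assumed, generic distillation objective)."""
    teacher_probs = [math.exp(lp) for lp in log_softmax(teacher_scores, temperature)]
    student_log_probs = log_softmax(student_scores, temperature)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def rerank(teacher_scores, student_score_fn, k=10, alpha=0.5):
    """Re-rank the teacher's top-k candidates with a student scorer.

    teacher_scores: dict mapping candidate entity -> teacher score.
    student_score_fn: hypothetical callable returning the student's score
        for one candidate entity.
    k, alpha: assumed hyperparameters (cutoff and teacher weight).
    Candidates outside the top k keep their teacher ordering.
    """
    ranked = sorted(teacher_scores, key=teacher_scores.get, reverse=True)
    top, rest = ranked[:k], ranked[k:]
    combined = {
        e: alpha * teacher_scores[e] + (1 - alpha) * student_score_fn(e)
        for e in top
    }
    return sorted(top, key=combined.get, reverse=True) + rest


if __name__ == "__main__":
    teacher = {"a": 3.0, "b": 2.5, "c": 1.0, "d": 0.1}
    student = {"a": 0.0, "b": 2.0}
    print(rerank(teacher, lambda e: student[e], k=2, alpha=0.5))
```

Only the top `k` candidates are passed to the student, so a more expensive scorer can be used at the second stage without scoring every entity in the graph.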