Ennajari Hafsa, Bouguila Nizar, Bentahar Jamal
IEEE Trans Neural Netw Learn Syst. 2023 Jul;34(7):3609-3623. doi: 10.1109/TNNLS.2021.3112045. Epub 2023 Jul 6.
Probabilistic topic models are considered an effective framework for text analysis that uncovers the main topics in an unlabeled set of documents. However, the topics inferred by traditional topic models are often unclear and hard to interpret because these models do not account for semantic structures in language. Recently, a number of topic modeling approaches have leveraged domain knowledge to enhance the quality of the learned topics, but they still assume a multinomial or Gaussian document likelihood in Euclidean space, which often results in information loss and poor performance. In this article, we propose a Bayesian embedded spherical topic model (ESTM) that combines both knowledge graph and word embeddings in a non-Euclidean curved space, the hypersphere, for better topic interpretability and discriminative text representations. Extensive experimental results show that our proposed model successfully uncovers interpretable topics and learns high-quality text representations useful for common natural language processing (NLP) tasks across multiple benchmark datasets.