Beyond the Failure of Direct-Matching in Keyword Evaluation: A Sketch of a Graph Based Solution.

Author information

Kölbl Max, Kyogoku Yuki, Philipp J Nathanael, Richter Michael, Rietdorf Clements, Yousef Tariq

Affiliation

Institute of Computer Science, NLP Group, Universität Leipzig, Leipzig, Germany.

Publication information

Front Artif Intell. 2022 Mar 24;5:801564. doi: 10.3389/frai.2022.801564. eCollection 2022.

Abstract

The starting point of this paper is the observation that methods based on the direct match of keywords are inadequate because they do not consider the cognitive ability of concept formation and abstraction. We argue that keyword evaluation needs to be based on a semantic model of language that captures the semantic relatedness of words, both to satisfy the claim of a human-like ability of concept formation and abstraction and to achieve better evaluation results. Evaluation of keywords is difficult since semantic informedness is required for this purpose. This model must be capable of identifying semantic relationships such as synonymy, hypernymy, hyponymy, and location-based abstraction. For example, when gathering texts from online sources, one usually finds a few keywords with each text. Still, these keyword sets are neither complete for the text nor closed in themselves, i.e., in most cases the keywords are a random subset of all possible keywords and are not very informative with respect to the complete keyword set. Therefore, algorithms based on these sets cannot achieve good evaluation results, nor can they provide good or better keywords, let alone a complete keyword set for a text. As a solution, we propose a word graph that captures all these semantic relationships for a given language. The problem with the hyponym/hypernym relationship is that, unlike synonymy, it is not bidirectional. Thus the space of keyword sets requires a metric that is non-symmetric, in other words, a quasi-metric. We sketch such a metric that works on our graph. Since it is nearly impossible to obtain such a complete word graph for a language, we propose for the keyword task a simpler graph based on the base text upon which the keyword sets should be evaluated. This reduction is usually sufficient for evaluating keyword sets.
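
To make the idea of a non-symmetric distance on a word graph concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the metric defined in the paper: the class `WordGraph`, the function `directed_set_distance`, the three edge costs, and the toy vocabulary are all hypothetical. Words are nodes, synonymy adds an edge in both directions, and a hyponym/hypernym pair adds a cheap edge toward the more general word and a more expensive edge toward the more specific one. The word-level distance is the cheapest directed path, which is zero from a word to itself, obeys the triangle inequality, and is generally not symmetric. The set-level score is a directed Hausdorff-style maximum, chosen here only as one plausible way to lift the word distance to non-empty keyword sets.

```python
import heapq
import math

# Hypothetical edge costs, chosen only for illustration.
SYNONYM_COST = 1.0     # applied in both directions
GENERALIZE_COST = 1.0  # hyponym -> hypernym (abstraction is cheap)
SPECIALIZE_COST = 3.0  # hypernym -> hyponym (specialization is costly)


class WordGraph:
    """Directed, weighted graph over words (illustrative sketch)."""

    def __init__(self):
        self.edges = {}  # word -> {neighbour: cost}

    def _add(self, src, dst, cost):
        self.edges.setdefault(src, {})
        self.edges.setdefault(dst, {})
        # Keep the cheapest cost if the same edge is added twice.
        self.edges[src][dst] = min(cost, self.edges[src].get(dst, math.inf))

    def add_synonyms(self, a, b):
        self._add(a, b, SYNONYM_COST)
        self._add(b, a, SYNONYM_COST)

    def add_hypernym(self, hyponym, hypernym):
        self._add(hyponym, hypernym, GENERALIZE_COST)
        self._add(hypernym, hyponym, SPECIALIZE_COST)

    def distance(self, src, dst):
        """Cheapest directed path cost from src to dst (Dijkstra)."""
        if src == dst:
            return 0.0
        best = {src: 0.0}
        heap = [(0.0, src)]
        while heap:
            d, node = heapq.heappop(heap)
            if node == dst:
                return d
            if d > best[node]:
                continue  # stale heap entry
            for nxt, cost in self.edges.get(node, {}).items():
                nd = d + cost
                if nd < best.get(nxt, math.inf):
                    best[nxt] = nd
                    heapq.heappush(heap, (nd, nxt))
        return math.inf  # dst is not reachable from src


def directed_set_distance(graph, candidate, reference):
    """Largest distance from any candidate keyword to its nearest
    reference keyword. Both sets are assumed to be non-empty."""
    return max(
        min(graph.distance(c, r) for r in reference) for c in candidate
    )


# Toy vocabulary, invented for this example.
graph = WordGraph()
graph.add_hypernym("poodle", "dog")
graph.add_hypernym("dog", "animal")
graph.add_synonyms("dog", "hound")

print(graph.distance("poodle", "animal"))  # 2.0
print(graph.distance("animal", "poodle"))  # 6.0
print(directed_set_distance(graph, {"poodle", "hound"}, {"animal"}))  # 2.0
```

With these assumed costs, a candidate keyword that is more specific than the reference ("poodle" against "animal") is penalized less than one that is more general, which is the kind of asymmetry a direct string match cannot express. A graph restricted to the words of the base text, as the abstract proposes, would be built the same way from a smaller vocabulary.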

