Shishaev Maksim, Dikovitsky Vladimir, Pimeshkov Vadim, Kuprikov Nikita, Kuprikov Mikhail, Shkodyrev Viacheslav
Putilov Institute for Informatics and Mathematical Modeling, Kola Science Centre of the Russian Academy of Sciences, Apatity, Russia.
Peter the Great St. Petersburg Polytechnic University, Saint Petersburg, Russia.
PeerJ Comput Sci. 2023 Oct 11;9:e1636. doi: 10.7717/peerj-cs.1636. eCollection 2023.
The article investigates whether SKOS (Simple Knowledge Organization System) relations between concepts represented by terms can be identified from the vector representations of those terms in general-purpose natural language models. Several language models of the Word2Vec and GloVe families are considered, and an artificial neural network (ANN) classifier of SKOS relations is built on top of them. The classifier is trained and tested on datasets derived from the DBpedia and EuroVoc thesauri. The experiments show that the classifier is highly effective when trained with GloVe family models, whereas training it with Word2Vec models appears infeasible within the ANN-based classifier architecture considered. From these results, the authors conclude that accounting for the global context in which terms are used in text plays a key role in making SKOS relations identifiable.
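As a rough illustration of the kind of setup the abstract describes, the sketch below classifies a pair of terms from their embedding vectors with a small feed-forward network. It is only a minimal sketch under stated assumptions, not the authors' architecture. The abstract does not say how a term pair is encoded, so concatenating the two vectors is an assumption. The label set (three standard SKOS properties plus a "none" class), the layer sizes, and the helper names `pair_features`, `init_params` and `predict_proba` are also hypothetical. The toy random vectors stand in for pretrained GloVe or Word2Vec embeddings, and the weights are untrained, so the printed probabilities carry no meaning.

```python
import numpy as np

# Hypothetical label set: three standard SKOS properties plus a "none" class (an assumption).
LABELS = ["skos:broader", "skos:narrower", "skos:related", "none"]

def pair_features(emb, term_a, term_b):
    """Concatenate the two term vectors into one input vector (assumed encoding)."""
    return np.concatenate([emb[term_a], emb[term_b]])

def init_params(dim_in, dim_hidden, n_classes, rng):
    """Small random weights for a one-hidden-layer network."""
    return {
        "W1": rng.normal(0.0, 0.1, (dim_in, dim_hidden)),
        "b1": np.zeros(dim_hidden),
        "W2": rng.normal(0.0, 0.1, (dim_hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }

def predict_proba(params, x):
    """Forward pass: ReLU hidden layer followed by a softmax over relation labels."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()

rng = np.random.default_rng(0)
# Toy 8-dimensional vectors standing in for pretrained GloVe or Word2Vec embeddings.
emb = {t: rng.normal(size=8) for t in ["animal", "dog", "leash"]}
params = init_params(dim_in=16, dim_hidden=12, n_classes=len(LABELS), rng=rng)
probs = predict_proba(params, pair_features(emb, "dog", "animal"))
print(dict(zip(LABELS, probs.round(3))))
```

In a real experiment the `emb` dictionary would be replaced by vectors loaded from a pretrained model, and the weights would be fitted on labelled term pairs drawn from a thesaurus such as DBpedia or EuroVoc.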