Building and evaluating resources for sentiment analysis in the Greek language.

Authors

Tsakalidis Adam, Papadopoulos Symeon, Voskaki Rania, Ioannidou Kyriaki, Boididou Christina, Cristea Alexandra I, Liakata Maria, Kompatsiaris Yiannis

Affiliations

Department of Computer Science, University of Warwick, Coventry, UK.

The Alan Turing Institute, London, UK.

Publication information

Lang Resour Eval. 2018;52(4):1021-1044. doi: 10.1007/s10579-018-9420-4. Epub 2018 Jul 14.

Abstract

Sentiment lexicons and word embeddings constitute well-established sources of information for sentiment analysis in online social media. Although their effectiveness has been demonstrated in state-of-the-art sentiment analysis and related tasks in the English language, such publicly available resources are much less developed and evaluated for the Greek language. In this paper, we tackle the problems arising when analyzing text in such an under-resourced language. We present and make publicly available a rich set of such resources, ranging from a manually annotated lexicon, to semi-supervised word embedding vectors and annotated datasets for different tasks. Our experiments using different algorithms and parameters on our resources show promising results over standard baselines; on average, we achieve a 24.9% relative improvement in F-score on the cross-domain sentiment analysis task when training the same algorithms with our resources, compared to training them on more traditional feature sources, such as n-grams. Importantly, while our resources were built with the primary focus on the cross-domain sentiment analysis task, they also show promising results in related tasks, such as emotion analysis and sarcasm detection.
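The "relative improvement in F-score" quoted above can be illustrated with a minimal sketch. It assumes the F-score is the balanced F1 (harmonic mean of precision and recall), which the abstract does not specify. The helper names `f_score` and `relative_improvement` and all numeric values are hypothetical and are not taken from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative change of `improved` over `baseline`, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical scores for illustration only (not results from the paper):
# the same classifier trained on n-gram features vs. on lexicon/embedding features.
baseline_f = f_score(precision=0.50, recall=0.50)
resource_f = f_score(precision=0.60, recall=0.60)

gain = relative_improvement(baseline_f, resource_f)
print(f"relative improvement: {gain:.1f}%")
```

A relative improvement is measured against the baseline score, so a rise from 0.50 to 0.60 is a gain of about 20%, not 10 percentage points. The average reported in the abstract would be obtained by computing such a figure for each algorithm and then averaging, although the exact averaging procedure is not stated here.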

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db2d/6411313/f0998e6b66f5/10579_2018_9420_Fig1_HTML.jpg
