Klimczak Karol Marek, Fryczak Jan Makary, Hadro Dominika, Fijałkowska Justyna
Institute of Management, Faculty of Organization and Management, Lodz University of Technology, Wólczańska 221, 93-005, Łódź, Poland.
Faculty of Economics and Finance, Wroclaw University of Economics and Business, Komandorska 118/120, 53-345, Wrocław, Poland.
MethodsX. 2024 May 10;12:102745. doi: 10.1016/j.mex.2024.102745. eCollection 2024 Jun.
This paper presents a technique for measuring sentiment in many languages. The method allows researchers to efficiently analyze corporate documents, management reports, and financial statements using Python. When the texts are written in several languages, the method extracts equivalent cross-linguistic sentiment features that can be used for statistical analysis or machine learning. We use Open Multilingual WordNet, a large lexicon that organizes words into semantic groups, as the knowledge base about word equivalence in more than 200 languages. We experiment with a parallel English-French corpus and find that our sentiment measures are comparable across the two languages. The method produces a consistent classification of positive and negative texts in both languages, and the sentiment measure values are correlated. The paper provides a detailed account of the method and the Python code, so that it can be applied to other languages and used in text mining, quantitative communication studies, and management research.
•Method to create equivalent sentiment measures in multiple languages
•Based on established lexicons and WordNet
•Validated for English and French
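The idea of linking sentiment words across languages through shared semantic groups can be illustrated with a minimal sketch. It is not the authors' code: the tiny `TOY_WORDNET` table stands in for Open Multilingual WordNet, and the synset ids, the seed word lists, the `sentiment` function, and the net-tone formula are all hypothetical choices made for illustration.

```python
import re

# Toy stand-in for Open Multilingual WordNet: (language, word) -> synset ids.
# All entries and ids are invented for illustration only.
TOY_WORDNET = {
    ("eng", "growth"): {"s01"},
    ("fra", "croissance"): {"s01"},
    ("eng", "profit"): {"s02"},
    ("fra", "profit"): {"s02"},
    ("eng", "loss"): {"s03"},
    ("fra", "perte"): {"s03"},
    ("eng", "risk"): {"s04"},
    ("fra", "risque"): {"s04"},
}

# Hypothetical English seed lexicon (an established lexicon would be used in practice).
SEED_POSITIVE = ["growth", "profit"]
SEED_NEGATIVE = ["loss", "risk"]


def to_synsets(words, lang):
    """Collect the synset ids of all words that the toy lexicon knows."""
    synsets = set()
    for word in words:
        synsets |= TOY_WORDNET.get((lang, word), set())
    return synsets


# Language-independent sentiment categories, expressed as synset ids.
POSITIVE_SYNSETS = to_synsets(SEED_POSITIVE, "eng")
NEGATIVE_SYNSETS = to_synsets(SEED_NEGATIVE, "eng")


def sentiment(text, lang):
    """Net tone in [-1, 1]: (positive - negative) / (positive + negative)."""
    tokens = re.findall(r"\w+", text.lower())
    pos = neg = 0
    for token in tokens:
        synsets = TOY_WORDNET.get((lang, token), set())
        if synsets & POSITIVE_SYNSETS:
            pos += 1
        if synsets & NEGATIVE_SYNSETS:
            neg += 1
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


english = "Profit growth offset the loss."
french = "La croissance du profit couvre la perte."
print(sentiment(english, "eng"), sentiment(french, "fra"))
```

Because both sentences are scored against the same synset-based categories, the two values can be compared directly. A real application would replace the toy table with lookups in Open Multilingual WordNet and would need to handle lemmatization and words that belong to several synsets.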