Haselmayer Martin, Jenny Marcelo
Department of Government, University of Vienna, Rooseveltplatz 3/1, 1090 Vienna, Austria.
Qual Quant. 2017;51(6):2623-2646. doi: 10.1007/s11135-016-0412-4. Epub 2016 Sep 21.
Sentiment is important in studies of news values, public opinion, negative campaigning or political polarization. The explosive expansion of digital textual data and rapid progress in automated text analysis provide vast opportunities for innovative social science research. Unfortunately, the tools currently available for automated sentiment analysis are mostly restricted to English texts and require considerable contextual adaptation to produce valid results. We present a procedure for collecting fine-grained sentiment scores through crowdcoding to build a negative sentiment dictionary in a language and for a domain of choice. The dictionary enables the analysis of large text corpora that resource-intensive hand-coding struggles to cope with. We calculate the tonality of sentences from dictionary words and validate these estimates against results from manual coding. The results show that the crowd-based dictionary provides efficient and valid measurement of sentiment. Empirical examples illustrate its use by analyzing the tonality of party statements and media reports.
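The dictionary approach described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how crowd ratings are aggregated or how word scores are combined into a sentence score, so everything here is an assumption made for illustration: the function names `build_dictionary` and `sentence_tonality`, the 0 to 4 negativity scale, averaging the crowd ratings per word, summing the scores of matched words, and the English example words.

```python
import re
from collections import defaultdict
from statistics import mean


def build_dictionary(ratings):
    """Average the crowd ratings for each word (assumed aggregation rule).

    `ratings` is an iterable of (word, score) pairs, one per crowd judgment.
    """
    by_word = defaultdict(list)
    for word, score in ratings:
        by_word[word.lower()].append(score)
    return {word: mean(scores) for word, scores in by_word.items()}


def sentence_tonality(sentence, dictionary):
    """Sum the dictionary scores of all words found in the sentence.

    Words that are not in the dictionary contribute 0 (assumed rule).
    """
    tokens = re.findall(r"\w+", sentence.lower())
    return sum(dictionary.get(token, 0.0) for token in tokens)


# Hypothetical crowd ratings on an assumed scale from 0 (not negative)
# to 4 (very negative); these values are invented for illustration.
ratings = [
    ("scandal", 4), ("scandal", 3),
    ("failure", 3), ("failure", 2),
    ("weak", 1), ("weak", 2),
]
dictionary = build_dictionary(ratings)
score = sentence_tonality("The scandal exposed a weak government.", dictionary)
print(score)
```

Validating such scores would then mean comparing them with manually coded sentences, for example through a correlation between the two sets of values.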