Ruelens Anna
Research Institute for Work and Society, University of Leuven, Parkstraat 47, Box 5300, 3000 Leuven, Belgium.
J Comput Soc Sci. 2022;5(1):731-749. doi: 10.1007/s42001-021-00148-2. Epub 2021 Oct 29.
While user-generated online content (UGC) is increasingly available, public opinion studies are yet to fully exploit the abundance and richness of online data. This study contributes to the practical knowledge of user-generated online content and machine learning techniques that can be used for the analysis of UGC. For this purpose, we explore the potential of user-generated content and present an application of natural language pre-processing, text mining and sentiment analysis to the question of public satisfaction with healthcare systems. Concretely, we analyze 634 online comments reflecting attitudes towards healthcare services in different countries. Our analysis identifies the frequency of topics related to healthcare services in textual content of the comments and attempts to classify and rank national healthcare systems based on the respondents' sentiment scores. In this paper, we describe our approach, summarize our main findings, and compare them with the results from cross-national surveys. Finally, we outline the typical limitations inherent in the analysis of user-generated online content and suggest avenues for future research.
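The pipeline named in the abstract (pre-processing, topic-term frequency, sentiment scoring, country ranking) can be illustrated with a minimal sketch. This is not the author's implementation: the comments, the country labels, the stopword list, and the tiny sentiment lexicon below are all hypothetical placeholders, and a simple lexicon count stands in for whatever sentiment method the study actually used.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy data: (country label, comment). Not taken from the study.
COMMENTS = [
    ("A", "Great doctors and short waiting times."),
    ("A", "Good care, but the insurance is expensive."),
    ("B", "Terrible waiting times and expensive insurance."),
    ("B", "The doctors were good."),
]

# Hypothetical mini-lexicon and stopword list, for illustration only.
POSITIVE = {"great", "good", "short"}
NEGATIVE = {"terrible", "expensive"}
STOPWORDS = {"and", "but", "the", "is", "were"}


def tokenize(text):
    """Pre-processing: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def sentiment(tokens):
    """Net sentiment: (positive - negative) / token count; 0.0 if empty."""
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def analyze(comments):
    """Return term frequencies, mean sentiment per country, and a ranking."""
    term_counts = Counter()
    scores = defaultdict(list)
    for country, text in comments:
        tokens = tokenize(text)
        term_counts.update(tokens)          # text mining: term frequency
        scores[country].append(sentiment(tokens))
    mean_scores = {c: sum(v) / len(v) for c, v in scores.items()}
    # Rank countries from most to least positive mean sentiment.
    ranking = sorted(mean_scores, key=mean_scores.get, reverse=True)
    return term_counts, mean_scores, ranking


term_counts, mean_scores, ranking = analyze(COMMENTS)
```

Averaging per-comment scores within each country keeps a single long comment from dominating the ranking; with real data, an established sentiment lexicon or a trained classifier would replace the toy word lists.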