Laureate Caitlin Doogan Poet, Buntine Wray, Linger Henry
Faculty of IT, Monash University, Wellington Rd, Clayton, VIC 3800 Australia.
College of Engineering and Computer Science, VinUniversity, Vinhomes Ocean Park, Gia Lam District, Hanoi 10000 Vietnam.
Artif Intell Rev. 2023 May 1:1-33. doi: 10.1007/s10462-023-10471-x.
Recent research on short text topic models has addressed the challenges posed by social media datasets. These models are typically evaluated using automated measures. However, recent work suggests that these measures do not indicate whether the topics produced can yield meaningful insights for those examining social media data. Efforts to address this issue, including gauging the alignment between automated and human evaluation tasks, are hampered by a lack of knowledge about how researchers use topic models. Further problems could arise if researchers do not construct topic models optimally or use them in ways that exceed the models' limitations. These scenarios threaten the validity of both topic model development and the insights produced by researchers who employ topic modelling as a methodology. Yet there is currently little information about how and why topic models are used in applied research. We therefore performed a systematic literature review of 189 articles in which topic modelling was used for social media analysis. Our results suggest that the development of topic models is not aligned with the needs of those who use them for social media analysis. We found that researchers use topic models sub-optimally and that there is a lack of methodological support for building and interpreting topics. We offer a set of recommendations for topic model researchers to address these problems and bridge the gap between development and applied research on short text topic models. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10462-023-10471-x.