Ibrohim Muhammad Okky, Budi Indra
Dipartimento di Informatica, Università degli Studi di Torino, Italy.
Faculty of Computer Science, Universitas Indonesia, Indonesia.
Heliyon. 2023 Jul 28;9(8):e18647. doi: 10.1016/j.heliyon.2023.e18647. eCollection 2023 Aug.
Hate Speech and Abusive Language (HSAL) has spread widely across social media, whose ease of use makes it easy to abuse for spreading such content. HSAL on social media must be detected because it can trigger conflict among citizens, not only online but often in real life as well. In recent years, many scholars have studied HSAL detection in various languages and media. However, much work remains before a better HSAL detection system can be built. This paper summarizes Indonesian HSAL detection research, following the Kitchenham systematic literature review method. We found that most Indonesian HSAL research still uses classic machine-learning approaches with classic text representation features, evaluated on Twitter text datasets. We also identified several challenges and tasks that need to be addressed to build a better HSAL detection system for Indonesian social media, one that can detect the target, category, and level of hate speech, as well as hate speech buzzers, thread starters, and fake-account spreaders.