Cécillon Noé, Labatut Vincent, Dufour Richard, Linarès Georges
LIA, Avignon University, Avignon, France.
Front Big Data. 2019 Jun 4;2:8. doi: 10.3389/fdata.2019.00008. eCollection 2019.
In recent years, online social networks have allowed users worldwide to meet and discuss. As guarantors of these communities, the administrators of these platforms must prevent users from adopting inappropriate behaviors. This verification task, mainly done by humans, is increasingly difficult because of the ever-growing number of messages to check. Methods have been proposed to automate this moderation process, mainly through approaches based on the textual content of the exchanged messages. Recent work has also shown that characteristics derived from the structure of conversations, in the form of conversational graphs, can help detect these abusive messages. In this paper, we propose to take advantage of both sources of information through fusion methods that integrate content- and graph-based features. Our experiments on raw chat logs show that the content of the messages and their dynamics within a conversation contain partially complementary information, allowing performance improvements on an abusive message classification task, with a final F-measure of 93.26%.
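As an illustration only, the two ideas named in the abstract (combining content- and graph-based features, and scoring with an F-measure) can be sketched in a few lines. The abstract does not say which fusion strategies were used, so this sketch assumes the simplest one, concatenating the two feature vectors of a message (often called early fusion), and the usual definition of the F-measure as the harmonic mean of precision and recall. The function names and all numeric values are hypothetical and are not taken from the paper.

```python
def fuse_features(content_features, graph_features):
    """Early fusion (an assumption): concatenate the two feature vectors of one message."""
    return list(content_features) + list(graph_features)


def f_measure(y_true, y_pred, positive=1):
    """F-measure of the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy values, not data from the paper.
fused = fuse_features([0.2, 0.7], [3.0, 0.5, 1.0])  # 2 content + 3 graph features
score = f_measure([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])  # 1 = abusive, 0 = not abusive
```

In a real pipeline the fused vector would be passed to a classifier; that step is left out here because the abstract does not name one.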