The Alan Turing Institute, London, United Kingdom.
Department of Computer Science, IT University of Copenhagen, Copenhagen, Denmark.
PLoS One. 2020 Dec 28;15(12):e0243300. doi: 10.1371/journal.pone.0243300. eCollection 2020.
Data-driven and machine learning-based approaches for detecting, categorising and measuring abusive content, such as hate speech and harassment, have gained traction due to their scalability, robustness and increasingly high performance. Building effective detection systems for abusive content relies on having the right training datasets, reflecting a widely accepted mantra in computer science: Garbage In, Garbage Out. However, creating training datasets which are large, varied, theoretically informed and which minimize biases is difficult, laborious and requires deep expertise. This paper systematically reviews 63 publicly available training datasets which have been created to train abusive language classifiers. It also reports on the creation of a dedicated website for cataloguing abusive language datasets, hatespeechdata.com. We discuss the challenges and opportunities of open science in this field, and argue that although more dataset sharing would bring many benefits, it also poses social and ethical risks which need careful consideration. Finally, we provide evidence-based recommendations for practitioners creating new abusive content training datasets.