Allaway Emily, McKeown Kathleen
Department of Computer Science, Columbia University, New York, NY, United States.
Front Artif Intell. 2023 Jan 13;5:1070429. doi: 10.3389/frai.2022.1070429. eCollection 2022.
A major challenge in stance detection is the large (potentially infinite) and diverse set of stance topics. Collecting data for such a set is unrealistic due to both the expense of annotation and the continuous creation of new real-world topics (e.g., a new politician runs for office). Furthermore, stancetaking occurs in a wide range of languages and genres (e.g., Twitter, news articles). While zero-shot stance detection, in which models are evaluated on topics not seen during training, has received increasing attention for English, we argue that this attention should be expanded to multilingual and multi-genre settings. We discuss two paradigms for English zero-shot stance detection evaluation, as well as recent work in this area. We then discuss recent work on multilingual and multi-genre stance detection, which has focused primarily on non-zero-shot settings. We argue that this work should be extended to multilingual and multi-genre zero-shot stance detection, and we propose best practices to systematize and stimulate future work in this direction. While domain adaptation techniques are well-suited for work in these settings, we argue that increased care should be taken to improve model explainability and to conduct robust evaluations that consider not only empirical generalization ability but also the understanding of complex language and inferences.
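As an illustration of the zero-shot setting described in the abstract, where evaluation is on topics not seen during training, the following is a minimal sketch of a topic-disjoint train/test split. The function name `zero_shot_topic_split`, the `topic`/`text`/`label` field names, and the `test_fraction` and `seed` parameters are assumptions made for this example; they are not taken from the paper and do not reproduce either of the evaluation paradigms it discusses.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical record layout: each example is a dict with "topic", "text", "label".
Example = Dict[str, str]


def zero_shot_topic_split(
    examples: List[Example],
    test_fraction: float = 0.25,
    seed: int = 0,
) -> Tuple[List[Example], List[Example]]:
    """Split examples so that no test topic appears in the training set."""
    # Collect the distinct topics and shuffle them reproducibly.
    topics = sorted({ex["topic"] for ex in examples})
    rng = random.Random(seed)
    rng.shuffle(topics)

    # Hold out whole topics (at least one) rather than individual examples.
    n_test = max(1, round(len(topics) * test_fraction))
    test_topics = set(topics[:n_test])

    train = [ex for ex in examples if ex["topic"] not in test_topics]
    test = [ex for ex in examples if ex["topic"] in test_topics]
    return train, test
```

The split is made over topics rather than over individual examples, so a model scored on the held-out portion has seen no labeled data for those topics. A multilingual or multi-genre variant could hold out languages or genres in the same way, again as an assumption rather than a procedure stated in the abstract.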