Department of Interactive Media, School of Communication, Hong Kong Baptist University, Kowloon, Hong Kong.
Department of Media and Communication, City University of Hong Kong, Kowloon, Hong Kong.
Cyberpsychol Behav Soc Netw. 2023 Jul;26(7):527-534. doi: 10.1089/cyber.2022.0158. Epub 2023 May 3.
Artificial intelligence (AI) has been increasingly integrated into content moderation to detect and remove hate speech on social media. An online experiment (N = 478) was conducted to examine how moderation agents (AI vs. human vs. human-AI collaboration) and removal explanations (with vs. without) affect users' perceptions and acceptance of removal decisions for hate speech targeting social groups with certain characteristics, such as religion or sexual orientation. The results showed that individuals exhibited consistent levels of perceived trustworthiness and acceptance of removal decisions regardless of the type of moderation agent. When explanations for the content takedown were provided, removal decisions made jointly by humans and AI were perceived as more trustworthy than the same decisions made by humans alone, which in turn increased users' willingness to accept the verdict. However, this moderated mediation effect was significant only when Muslims, not homosexuals, were the target of the hate speech.