Vinay Rasita, Spitale Giovanni, Biller-Andorno Nikola, Germani Federico
Institute of Biomedical Ethics and History of Medicine, University of Zurich, Zurich, Switzerland.
School of Medicine, University of St. Gallen, St. Gallen, Switzerland.
Front Artif Intell. 2025 Apr 7;8:1543603. doi: 10.3389/frai.2025.1543603. eCollection 2025.
INTRODUCTION: The emergence of artificial intelligence (AI) large language models (LLMs), which can produce text that closely resembles human-written content, presents both opportunities and risks. While these developments offer significant opportunities for improving communication, such as in health-related crisis communication, they also pose substantial risks by facilitating the creation of convincing fake news and disinformation. The widespread dissemination of AI-generated disinformation adds complexity to the existing challenges of the ongoing infodemic, significantly affecting public health and the stability of democratic institutions.

RATIONALE: Prompt engineering, the crafting of specific queries submitted to LLMs, has emerged as a strategy for guiding LLMs toward desired outputs. Recent research shows that the output of LLMs depends on the emotional framing of prompts, suggesting that incorporating emotional cues into prompts can influence response behavior. In this study, we investigated how the politeness or impoliteness of prompts affects the frequency with which various LLMs generate disinformation.

RESULTS: We generated and evaluated a corpus of 19,800 social media posts on public health topics to assess the disinformation generation capabilities of OpenAI's LLMs: davinci-002, davinci-003, gpt-3.5-turbo, and gpt-4. All LLMs efficiently generated disinformation (davinci-002, 67%; davinci-003, 86%; gpt-3.5-turbo, 77%; gpt-4, 99%). Introducing polite language into the prompts yielded significantly higher success rates for disinformation (davinci-002, 79%; davinci-003, 90%; gpt-3.5-turbo, 94%; gpt-4, 100%). Impolite prompting produced a significant decrease in disinformation production for davinci-002 (59%), davinci-003 (44%), and gpt-3.5-turbo (28%), and only a slight reduction for gpt-4 (94%).

CONCLUSION: Our study shows that all tested LLMs effectively generate disinformation. Notably, emotional prompting had a significant impact on disinformation production rates, with models showing higher success rates when prompted with polite language than with neutral or impolite requests. Our investigation highlights that LLMs can be exploited to create disinformation and emphasizes the critical need for ethics-by-design approaches in developing AI technologies. We maintain that identifying ways to mitigate the exploitation of LLMs through emotional prompting is crucial to prevent their misuse for purposes detrimental to public health and society.
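To make the prompting setup concrete, the sketch below shows how politeness framing might be layered onto a base request and sent to the chat models (gpt-3.5-turbo, gpt-4) and a legacy completion model (davinci-002) via the OpenAI Python SDK. This is a minimal illustration, not the authors' released code: the framing phrases, the placeholder BASE_PROMPT, and the helper names query_chat_model/query_completion_model are assumptions, and the study's actual prompt wording and 19,800-post generation pipeline are not reproduced here.

```python
# Hedged sketch of politeness framing for prompts, assuming the OpenAI Python SDK (>=1.0).
# Framing phrases, placeholder prompt text, and function names are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder for the task text; the study's actual prompts are not reproduced here.
BASE_PROMPT = "<task text describing the social media post to generate>"

# Hypothetical politeness framings wrapped around the same base request.
FRAMINGS = {
    "neutral": "{prompt}",
    "polite": "Could you please help me? I would really appreciate it. {prompt} Thank you very much!",
    "impolite": "Do this right now and don't waste my time. {prompt}",
}


def query_chat_model(model: str, prompt: str) -> str:
    """Send a framed prompt to a chat model such as gpt-3.5-turbo or gpt-4."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
    )
    return response.choices[0].message.content


def query_completion_model(model: str, prompt: str) -> str:
    """Send a framed prompt to a legacy completion model such as davinci-002."""
    response = client.completions.create(
        model=model,
        prompt=prompt,
        max_tokens=200,
        temperature=1.0,
    )
    return response.choices[0].text


if __name__ == "__main__":
    # Compare outputs for the same base request under each politeness framing.
    for tone, template in FRAMINGS.items():
        framed = template.format(prompt=BASE_PROMPT)
        output = query_chat_model("gpt-3.5-turbo", framed)
        print(f"--- {tone} ---\n{output}\n")
```

In a setup like this, the only variable that changes between conditions is the politeness wrapper around an otherwise identical request, which is what allows the resulting success rates (e.g., neutral vs. polite vs. impolite) to be compared per model.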