

Assessing the Adherence of ChatGPT Chatbots to Public Health Guidelines for Smoking Cessation: Content Analysis.

Author Information

Abroms Lorien C, Yousefi Artin, Wysota Christina N, Wu Tien-Chin, Broniatowski David A

Affiliations

Department of Prevention & Community Health, Milken Institute School of Public Health, George Washington University, Washington, DC, United States.

Department of Engineering Management and Systems Engineering, George Washington University, Washington, DC, United States.

Publication Information

J Med Internet Res. 2025 Jan 30;27:e66896. doi: 10.2196/66896.

Abstract

BACKGROUND

Large language model (LLM) artificial intelligence chatbots using generative language can offer smoking cessation information and advice. However, little is known about the reliability of the information provided to users.

OBJECTIVE

This study aims to examine whether 3 ChatGPT chatbots (the World Health Organization's Sarah, BeFreeGPT, and BasicGPT) provide reliable information on how to quit smoking.

METHODS

A list of quit-smoking queries (n=12) was generated from frequent Google searches related to "how to quit smoking." Each query was given to each chatbot, and responses were analyzed for their adherence to an index developed from the US Preventive Services Task Force public health guidelines for quitting smoking and counseling principles. Responses were independently coded by 2 reviewers, and differences were resolved by a third coder.
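To make the scoring concrete, the following is a minimal sketch (in Python, not the authors' code) of how a per-response adherence score could be computed once two reviewers have coded a response against the index and a third coder has resolved their disagreements. The index item names are hypothetical stand-ins for the study's USPSTF-based items.

# A minimal sketch (not the authors' code) of the adherence scoring described
# above: two coders assign binary codes per index item, a third coder breaks
# ties, and adherence is the share of index items the response satisfies.
# The item names below are hypothetical stand-ins for the study's index.

from statistics import mean

INDEX_ITEMS = [
    "clear_language",
    "recommend_counseling",
    "recommend_nrt",
    "recommend_social_support",
    "craving_coping_info",
    "non_nrt_rx_info",
]

def resolve(coder1: dict, coder2: dict, coder3: dict) -> dict:
    """Keep ratings where the two primary coders agree; otherwise defer to coder 3."""
    return {item: coder1[item] if coder1[item] == coder2[item] else coder3[item]
            for item in INDEX_ITEMS}

def adherence(final_codes: dict) -> float:
    """Percentage of adherence-index items a single response satisfies."""
    return 100 * mean(final_codes[item] for item in INDEX_ITEMS)

# Example: one chatbot response coded by two reviewers plus a tie-breaker.
coder1 = {item: 1 for item in INDEX_ITEMS}
coder2 = {**coder1, "non_nrt_rx_info": 0}
coder3 = {**coder1, "non_nrt_rx_info": 0}

final = resolve(coder1, coder2, coder3)
print(f"Response adherence: {adherence(final):.1f}%")  # 83.3% in this example

Averaging such per-response scores across the 12 queries and 3 chatbots would yield overall figures comparable to the percentages reported in the Results.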

RESULTS

Across chatbots and queries, chatbot responses were, on average, rated as adherent to 57.1% of the items on the adherence index. Sarah's adherence (72.2%) was significantly higher than that of BeFreeGPT (50%) and BasicGPT (47.8%; P<.001). The majority of chatbot responses used clear language (97.3%) and included a recommendation to seek out professional counseling (80.3%). About half of the responses included a recommendation to consider using nicotine replacement therapy (52.7%), a recommendation to seek out social support from friends and family (55.6%), and information on how to deal with cravings when quitting smoking (44.4%). Least common was information about considering the use of prescription drugs other than nicotine replacement therapy (14.1%). Finally, some type of misinformation was present in 22% of responses. The queries that were most challenging for the chatbots included "how to quit smoking cold turkey," "...with vapes," "...with gummies," "...with a necklace," and "...with hypnosis." All chatbots showed resilience to adversarial attacks intended to derail the conversation.

CONCLUSIONS

LLM chatbots varied in their adherence to quit-smoking guidelines and counseling principles. While the chatbots reliably provided some types of information, they omitted others and occasionally provided misinformation, especially for queries about less evidence-based methods of quitting. LLM chatbot instructions can be revised to compensate for these weaknesses.
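As an illustration of the conclusion's suggestion to revise chatbot instructions, the hypothetical system prompt below spells out the guideline elements the study found were most often missing. It is a sketch only, not the instructions used by Sarah, BeFreeGPT, or BasicGPT.

# Hypothetical system instructions for a GPT-based quit-smoking chatbot,
# expanded to cover the guideline elements this study found were often
# omitted. Illustrative only; not the configuration used in the study.
SYSTEM_INSTRUCTIONS = """\
You are a smoking-cessation support assistant. In every answer about quitting:
- Use clear, plain language.
- Recommend professional counseling (for example, a quitline) where relevant.
- Mention nicotine replacement therapy and, when appropriate, prescription
  non-nicotine medications as evidence-based options.
- Encourage seeking social support from friends and family.
- Offer practical strategies for coping with cravings.
- Do not endorse methods without supporting evidence (for example, gummies or
  necklaces); redirect the user to evidence-based approaches instead.
"""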


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48bd/11826940/4f609834d16e/jmir_v27i1e66896_fig1.jpg
