

A Comparison of Responses from Human Therapists and Large Language Model-Based Chatbots to Assess Therapeutic Communication: Mixed Methods Study.

Authors

Scholich Till, Barr Maya, Wiltsey Stirman Shannon, Raj Shriti

Affiliations

Institute for Human-Centered AI, Stanford University, Stanford, CA, United States.

PGSP-Stanford PsyD Consortium, Palo Alto University, Palo Alto, CA, United States.

Publication

JMIR Ment Health. 2025 May 21;12:e69709. doi: 10.2196/69709.

DOI: 10.2196/69709
PMID: 40397927
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12138294/
Abstract

BACKGROUND

Consumers are increasingly using large language model-based chatbots to seek mental health advice or intervention due to ease of access and limited availability of mental health professionals. However, their suitability and safety for mental health applications remain underexplored, particularly in comparison to professional therapeutic practices.

OBJECTIVE

This study aimed to evaluate how general-purpose chatbots respond to mental health scenarios and compare their responses to those provided by licensed therapists. Specifically, we sought to identify chatbots' strengths and limitations, as well as the ethical and practical considerations necessary for their use in mental health care.

METHODS

We conducted a mixed methods study to compare responses from chatbots and licensed therapists to scripted mental health scenarios. We created 2 fictional scenarios and prompted 3 chatbots to create 6 interaction logs. We recruited 17 therapists and conducted study sessions that consisted of 3 activities. First, therapists responded to the 2 scenarios using a Qualtrics form. Second, therapists went through the 6 interaction logs using a think-aloud procedure to highlight their thoughts about the chatbots' responses. Finally, we conducted a semistructured interview to explore subjective opinions on the use of chatbots for supporting mental health. The study sessions were analyzed using thematic analysis. The interaction logs from chatbot and therapist responses were coded using the Multitheoretical List of Therapeutic Interventions codes and then compared to each other.

RESULTS

We identified 7 themes describing the strengths and limitations of the chatbots as compared to therapists. These include elements of good therapy in chatbot responses, conversational style of chatbots, insufficient inquiry and feedback seeking by chatbots, chatbot interventions, client engagement, chatbots' responses to crisis situations, and considerations for chatbot-based therapy. In the use of Multitheoretical List of Therapeutic Interventions codes, we found that therapists evoked more elaboration (Mann-Whitney U=9; P=.001) and used more self-disclosure (U=45.5; P=.37) as compared to the chatbots. The chatbots used affirming (U=28; P=.045) and reassuring (U=23; P=.02) language more often than the therapists. The chatbots also used psychoeducation (U=22.5; P=.02) and suggestions (U=12.5; P=.003) more often than the therapists.
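The between-group comparisons above rest on Mann-Whitney U tests applied to counts of Multitheoretical List of Therapeutic Interventions codes. As a minimal sketch of how such a test works (the per-response code counts below are invented for illustration, not the study's data, and the normal approximation omits the tie correction a statistics library would apply):

```python
from statistics import NormalDist

def mann_whitney_u(x, y):
    # U statistic: count pairs where x beats y; ties count half (midrank convention).
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)

# Invented per-response counts of one intervention code (illustration only).
therapists = [0, 1, 0, 2, 1, 0, 1, 0]
chatbots = [3, 2, 4, 3, 2, 3]

u = mann_whitney_u(therapists, chatbots)
n1, n2 = len(therapists), len(chatbots)
# Two-sided P via the large-sample normal approximation (no tie correction).
mu = n1 * n2 / 2
sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
p = 2 * NormalDist().cdf(-abs((u - mu) / sigma))
print(f"U = {u}, P = {p:.3f}")
```

In practice one would use `scipy.stats.mannwhitneyu`, which handles tie corrections and exact small-sample P values; the hand-rolled version above only shows what the U statistic measures.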

CONCLUSIONS

Our study demonstrates the unsuitability of general-purpose chatbots to safely engage in mental health conversations, particularly in crisis situations. While chatbots display elements of good therapy, such as validation and reassurance, overuse of directive advice without sufficient inquiry and use of generic interventions make them unsuitable as therapeutic agents. Careful research and evaluation will be necessary to determine the impact of chatbot interactions and to identify the most appropriate use cases related to mental health.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7d6/12138294/e6d88495be91/mental_v12i1e69709_fig1.jpg

Similar Articles

1. A Comparison of Responses from Human Therapists and Large Language Model-Based Chatbots to Assess Therapeutic Communication: Mixed Methods Study.
JMIR Ment Health. 2025 May 21;12:e69709. doi: 10.2196/69709.
2. Expert and Interdisciplinary Analysis of AI-Driven Chatbots for Mental Health Support: Mixed Methods Study.
J Med Internet Res. 2025 Apr 25;27:e67114. doi: 10.2196/67114.
3. Accuracy of Prospective Assessments of 4 Large Language Model Chatbot Responses to Patient Questions About Emergency Care: Experimental Comparative Study.
J Med Internet Res. 2024 Nov 4;26:e60291. doi: 10.2196/60291.
4. Evaluation of the Current State of Chatbots for Digital Health: Scoping Review.
J Med Internet Res. 2023 Dec 19;25:e47217. doi: 10.2196/47217.
5. Therapeutic Potential of Social Chatbots in Alleviating Loneliness and Social Anxiety: Quasi-Experimental Mixed Methods Study.
J Med Internet Res. 2025 Jan 14;27:e65589. doi: 10.2196/65589.
6. Artificial Intelligence Chatbot Behavior Change Model for Designing Artificial Intelligence Chatbots to Promote Physical Activity and a Healthy Diet: Viewpoint.
J Med Internet Res. 2020 Sep 30;22(9):e22845. doi: 10.2196/22845.
7. The Efficacy of Conversational AI in Rectifying the Theory-of-Mind and Autonomy Biases: Comparative Analysis.
JMIR Ment Health. 2025 Feb 7;12:e64396. doi: 10.2196/64396.
8. Exploring the Ethical Challenges of Conversational AI in Mental Health Care: Scoping Review.
JMIR Ment Health. 2025 Feb 21;12:e60432. doi: 10.2196/60432.
9. Putting ChatGPT's Medical Advice to the (Turing) Test: Survey Study.
JMIR Med Educ. 2023 Jul 10;9:e46939. doi: 10.2196/46939.
10. Large Language Model (LLM)-Powered Chatbots Fail to Generate Guideline-Consistent Content on Resuscitation and May Provide Potentially Harmful Advice.
Prehosp Disaster Med. 2023 Dec;38(6):757-763. doi: 10.1017/S1049023X23006568. Epub 2023 Nov 6.

References Cited in This Article

1. Large language models in patient education: a scoping review of applications in medicine.
Front Med (Lausanne). 2024 Oct 29;11:1477898. doi: 10.3389/fmed.2024.1477898. eCollection 2024.
2. Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial.
JAMA Netw Open. 2024 Oct 1;7(10):e2440969. doi: 10.1001/jamanetworkopen.2024.40969.
3. Describing the Framework for AI Tool Assessment in Mental Health and Applying It to a Generative AI Obsessive-Compulsive Disorder Platform: Tutorial.
JMIR Form Res. 2024 Oct 18;8:e62963. doi: 10.2196/62963.
4. The Opportunities and Risks of Large Language Models in Mental Health.
JMIR Ment Health. 2024 Jul 29;11:e59479. doi: 10.2196/59479.
5. Can Large Language Models Replace Therapists? Evaluating Performance at Simple Cognitive Behavioral Therapy Tasks.
JMIR AI. 2024 Jul 30;3:e52500. doi: 10.2196/52500.
6. Roles, Users, Benefits, and Limitations of Chatbots in Health Care: Rapid Review.
J Med Internet Res. 2024 Jul 23;26:e56930. doi: 10.2196/56930.
7. Large Language Models Versus Expert Clinicians in Crisis Prediction Among Telemental Health Patients: Comparative Study.
JMIR Ment Health. 2024 Aug 2;11:e58129. doi: 10.2196/58129.
8. Loneliness and suicide mitigation for students using GPT3-enabled chatbots.
Npj Ment Health Res. 2024 Jan 22;3(1):4. doi: 10.1038/s44184-023-00047-6.
9. Charting new AI education in gastroenterology: Cross-sectional evaluation of ChatGPT and perplexity AI in medical residency exam.
Dig Liver Dis. 2024 Aug;56(8):1304-1311. doi: 10.1016/j.dld.2024.02.019. Epub 2024 Mar 19.
10. Understanding the Benefits and Challenges of Using Large Language Model-based Conversational Agents for Mental Well-being Support.
AMIA Annu Symp Proc. 2024 Jan 11;2023:1105-1114. eCollection 2023.