Dong Wanghao, Wang Weijun, Han Xinheng, Huang Junhao, Li Lie, Huang Yinghui
Key Laboratory of Adolescent Cyberpsychology and Behavior (Ministry of Education), 152 Luoyu Road, Hongshan District, Wuhan, Hubei Province, 430079, China; School of Psychology, Central China Normal University, 152 Luoyu Road, Hongshan District, Wuhan, Hubei Province, 430079, China.
School of Psychology, Central China Normal University, 152 Luoyu Road, Hongshan District, Wuhan, Hubei Province, 430079, China.
Comput Biol Med. 2025 Sep;196(Pt B):110789. doi: 10.1016/j.compbiomed.2025.110789. Epub 2025 Jul 23.
The global shortage of mental health services has sparked considerable interest in leveraging generative artificial intelligence (AI) to address psychological health challenges. This study systematically evaluates the emotional support capabilities of GPT-4 and explores ways to enhance its performance through targeted prompt engineering. First, natural language processing and explainable machine learning were used to develop a predictive model for evaluating the empathy of GPT-4's responses. Feature sensitivity analysis then identified key linguistic features that significantly influence empathy performance. These insights were integrated with empathy component theory and helping skills theory to design prompts that enhance GPT-4's emotional support capabilities. Evaluation results show that while unprompted GPT-4 demonstrates substantial empathy in addressing help-seekers' needs, it still lags behind human counselors. When guided by targeted prompts, however, GPT-4's emotional support capabilities improve markedly compared to its zero-prompt version. Notably, in handling emotional issues such as anger, fear, and disgust, prompted GPT-4 performs at a level comparable to human counselors. In summary, this study provides initial evidence of GPT-4's potential in emotional support and introduces an evaluation framework (initial-Evaluation, Enhancement, and re-Evaluation; EEE) that can be used to assess and optimize the abilities of large language models (LLMs) in mental health applications, offering insights into their role in supporting human mental health services.
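As a rough illustration of the EEE loop described in the abstract, the following Python sketch scores responses with a toy linear empathy model, ranks features by a simple ablation, and compares scores with and without an enhanced prompt. Everything in it is an assumption made for illustration: the feature names, the word list, the weights, and the `stub_generate` function (a stand-in for an LLM call) are hypothetical and are not taken from the study, whose actual model and features are not specified in the abstract.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical word list and weights; stand-ins for a trained empathy predictor.
EMOTION_WORDS = {"sorry", "understand", "feel", "hard", "painful"}
WEIGHTS = {"second_person": 0.5, "emotion_words": 1.0, "questions": 0.25}


def extract_features(text: str) -> Dict[str, float]:
    """Count a few illustrative linguistic features in a response."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    return {
        "second_person": float(sum(t in {"you", "your"} for t in tokens)),
        "emotion_words": float(sum(t in EMOTION_WORDS for t in tokens)),
        "questions": float(text.count("?")),
    }


def empathy_score(text: str) -> float:
    """Toy linear empathy score: weighted sum of the features."""
    feats = extract_features(text)
    return sum(WEIGHTS[name] * value for name, value in feats.items())


def feature_sensitivity(texts: List[str]) -> Dict[str, float]:
    """Mean score drop when each feature is zeroed out (simple ablation)."""
    rows = [extract_features(t) for t in texts]
    return {name: mean(WEIGHTS[name] * r[name] for r in rows) for name in WEIGHTS}


def eee(generate: Callable[[str, str], str], posts: List[str],
        enhanced_prompt: str) -> Dict[str, float]:
    """initial-Evaluation, Enhancement (prompt swap), and re-Evaluation."""
    initial = mean(empathy_score(generate("", p)) for p in posts)
    re_evaluated = mean(empathy_score(generate(enhanced_prompt, p)) for p in posts)
    return {"initial": initial, "re_evaluated": re_evaluated}


def stub_generate(system_prompt: str, post: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned text."""
    if system_prompt:
        return "I am sorry you feel this way. What has been hardest for you?"
    return "Try to get more sleep."


if __name__ == "__main__":
    posts = ["I can't stop worrying.", "I feel angry all the time."]
    print(eee(stub_generate, posts, "Reflect feelings, then ask an open question."))
```

In a real setting, the stub would be replaced by calls to the model under evaluation, and the linear scorer by a trained, explainable predictor whose feature attributions guide the prompt design.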