Stallman Jane K, Haeffel Gerald J
Department of Psychology, University of Notre Dame, Notre Dame, IN, 46556, USA.
Sci Rep. 2025 Jun 5;15(1):19782. doi: 10.1038/s41598-025-03740-y.
We tested whether GPT could predict changes in depressive symptoms from participants' (n = 930) causal explanations for negative life events. Of 30 GPT prompts, 2 yielded output that reliably predicted changes in future depressive symptoms, but this output was not a better predictor than the traditional paper-and-pencil measure of cognitive risk for depression (the Cognitive Style Questionnaire). These findings highlight potential limitations of large language models like GPT. Human thought is complex, and language may not accurately represent people's internal cognitive processes. In this case, participants' written explanations for negative life events did not contain meaningful information that could be used for differentiation or that was indicative of some latent construct. We found that people could generate equally negative causal explanations for negative events yet hold different beliefs about the changeability of those causes. Our results support the hypothesis that it is the perceived changeability, not the overall negativity, of causal beliefs that determines risk for depressive symptoms. GPT cannot yet discern this changeability as well as a paper-and-pencil questionnaire.