Evaluating Large Language Models' Ability Using a Psychiatric Screening Tool Based on Metaphor and Sarcasm Scenarios

Author information

Yakura Hiromu

Affiliation

Max-Planck Institute for Human Development, 14195 Berlin, Germany.

Publication information

J Intell. 2024 Jul 21;12(7):70. doi: 10.3390/jintelligence12070070.

DOI: 10.3390/jintelligence12070070

PMID: 39057190

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11278383/

Abstract

Metaphors and sarcasm are precious fruits of our highly evolved social communication skills. However, children with the condition then known as Asperger syndrome are known to have difficulties in comprehending sarcasm, even if they possess adequate verbal IQs for understanding metaphors. Accordingly, researchers had employed a screening test that assesses metaphor and sarcasm comprehension to distinguish Asperger syndrome from other conditions with similar external behaviors (e.g., attention-deficit/hyperactivity disorder). This study employs a standardized test to evaluate recent large language models' (LLMs) understanding of nuanced human communication. The results indicate improved metaphor comprehension with increased model parameters; however, no similar improvement was observed for sarcasm comprehension. Considering that a human's ability to grasp sarcasm has been associated with the amygdala, a pivotal cerebral region for emotional learning, a distinctive strategy for training LLMs would be imperative to imbue them with the ability in a cognitively grounded manner.

