
Challenging large language models' "intelligence" with human tools: A neuropsychological investigation in Italian language on prefrontal functioning.

Authors

Loconte Riccardo, Orrù Graziella, Tribastone Mirco, Pietrini Pietro, Sartori Giuseppe

Affiliations

Molecular Mind Lab, IMT School of Advanced Studies Lucca, Lucca, Italy.

University of Pisa, Pisa, Italy.

Publication

Heliyon. 2024 Oct 3;10(19):e38911. doi: 10.1016/j.heliyon.2024.e38911. eCollection 2024 Oct 15.

Abstract

The Artificial Intelligence (AI) research community has used ad-hoc benchmarks to measure the "intelligence" level of Large Language Models (LLMs). In humans, intelligence is closely linked to the functional integrity of the prefrontal lobes, which are essential for higher-order cognitive processes. Previous research has found that LLMs struggle with cognitive tasks that rely on these prefrontal functions, highlighting a significant challenge in replicating human-like intelligence. In December 2022, OpenAI released ChatGPT, a new chatbot based on the GPT-3.5 model that quickly gained popularity for its impressive ability to understand and respond to human instructions, suggesting a significant step towards intelligent behaviour in AI. Therefore, to rigorously investigate LLMs' level of "intelligence," we evaluated the GPT-3.5 and GPT-4 versions through a neuropsychological assessment using tests in the Italian language routinely employed to assess prefrontal functioning in humans. The same tests were also administered to Claude2 and Llama2 to verify whether similar language models perform similarly in prefrontal tests. When using human performance as a reference, GPT-3.5 showed inhomogeneous results on prefrontal tests, with some tests well above average, others in the lower range, and others frankly impaired. Specifically, we have identified poor planning abilities and difficulty in recognising semantic absurdities and understanding others' intentions and mental states. Claude2 exhibited a similar pattern to GPT-3.5, while Llama2 performed poorly in almost all tests. These inconsistent profiles highlight how LLMs' emergent abilities do not yet mimic human cognitive functioning. The sole exception was GPT-4, which performed within the normative range for all the tasks except planning. Furthermore, we showed how standardised neuropsychological batteries developed to assess human cognitive functions may be suitable for challenging LLMs' performance.


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1988/11490853/6b0f5aa05d66/gr1.jpg
