Assessing novelty, feasibility and value of creative ideas with an unsupervised approach using GPT-4.

Authors

Kern Felix B, Wu Chien-Te, Chao Zenas C

Affiliation

International Research Center for Neurointelligence (WPI-IRCN), UTIAS, The University of Tokyo, Tokyo, Japan.

Publication details

Br J Psychol. 2024 Jul 22. doi: 10.1111/bjop.12720.

Abstract

Creativity is defined by three key factors: novelty, feasibility and value. While many creativity tests focus primarily on novelty, they often neglect feasibility and value, thereby limiting their reflection of real-world creativity. In this study, we employ GPT-4, a large language model, to assess these three dimensions in a Japanese-language Alternative Uses Test (AUT). Using a crowdsourced evaluation method, we acquire ground truth data for 30 question items and test various GPT prompt designs. Our findings show that asking for multiple responses in a single prompt, using an 'explain first, rate later' design, is both cost-effective and accurate (r = .62, .59 and .33 for novelty, feasibility and value, respectively). Moreover, our method offers comparable accuracy to existing methods in assessing novelty, without the need for training data. We also evaluate additional models such as GPT-4 Turbo, GPT-4 Omni and Claude 3.5 Sonnet. Comparable performance across these models demonstrates the universal applicability of our prompt design. Our results contribute a straightforward platform for instant AUT evaluation and provide valuable ground truth data for future methodological research.
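Illustrative sketch (not the authors' code): the abstract describes a prompt that covers several responses at once and asks for an explanation before the rating, with accuracy reported as a correlation against crowdsourced ratings. The Python below shows one way such a pipeline could be wired together. The prompt wording, the 1-to-5 scale, the output line format, and the function names `build_prompt`, `parse_ratings` and `pearson` are all assumptions made for illustration; the study itself used a Japanese-language AUT, and its actual prompts are not given in the abstract. No model API is called here; the model reply is treated as a plain string.

```python
import math
import re

DIMENSIONS = ("novelty", "feasibility", "value")

def build_prompt(obj, uses):
    """Build ONE prompt covering several AUT responses (hypothetical wording)."""
    numbered = "\n".join(f"{i}. {u}" for i, u in enumerate(uses, start=1))
    return (
        f"Object: {obj}\n"
        f"Proposed alternative uses:\n{numbered}\n\n"
        # 'Explain first, rate later': reasoning is requested before the scores.
        "For each use, first explain your reasoning in one or two sentences, "
        "then give ratings from 1 to 5 (assumed scale) on a separate line "
        "formatted exactly as:\n"
        "<number>. novelty=<n> feasibility=<n> value=<n>"
    )

# Matches only the rating lines; explanation lines are ignored.
RATING_LINE = re.compile(
    r"^(\d+)\.\s*novelty=(\d+)\s+feasibility=(\d+)\s+value=(\d+)\s*$",
    re.MULTILINE,
)

def parse_ratings(reply):
    """Extract {response_number: {dimension: score}} from a model reply string."""
    ratings = {}
    for match in RATING_LINE.finditer(reply):
        idx, *scores = (int(g) for g in match.groups())
        ratings[idx] = dict(zip(DIMENSIONS, scores))
    return ratings

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)
```

In use, the parsed model scores for each dimension would be paired with the mean crowdsourced score for the same response and passed to `pearson`, giving one correlation per dimension, comparable in form to the r values reported above.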
