Puyt Richard W, Madsen Dag Øivind
Industrial Engineering and Business Information Systems (IEBIS), Faculty of Behavioural, Management and Social Sciences (BMS), University of Twente, Enschede, Netherlands.
Department of Business, Marketing and Law, USN School of Business, University of South-Eastern Norway, Hønefoss, Norway.
Front Artif Intell. 2024 May 3;7:1402047. doi: 10.3389/frai.2024.1402047. eCollection 2024.
In this study, we test ChatGPT-4's ability to provide accurate information about the origins and evolution of SWOT analysis, perhaps the most widely used strategy tool in practice worldwide. ChatGPT-4 is tested for historical accuracy and hallucinations. The API is prompted using a Python script with a series of structured questions read from an Excel file, and the results are recorded in another Excel file and rated on a binary scale. Our findings present a nuanced view of ChatGPT-4's capabilities. While ChatGPT-4 demonstrates a high level of proficiency in describing and outlining the general concept of SWOT analysis, there are notable discrepancies when it comes to detailing its origins and evolution. These inaccuracies range from minor factual errors to more serious hallucinations that deviate from evidence in scholarly publications. However, we also find that ChatGPT-4 spontaneously produces historically accurate facts. We interpret these results as suggesting that ChatGPT has been trained largely on easily available websites and only to a very limited extent on scholarly publications on SWOT analysis, especially those behind a paywall. We conclude with four propositions for future research.
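The prompting and rating workflow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' script: the names `run_prompts` and `share_accurate` and the record layout are hypothetical, the model call is passed in as a plain function so that no particular API client is assumed, and reading from and writing to Excel is left out.

```python
from typing import Callable, Iterable, Optional


def run_prompts(questions: Iterable[str], ask: Callable[[str], str]) -> list:
    """Send each structured question to the model and keep one record per question.

    `ask` is a hypothetical stand-in for whatever function calls the model API.
    """
    records = []
    for number, question in enumerate(questions, start=1):
        records.append({
            "id": number,
            "question": question,
            "answer": ask(question),
            "rating": None,  # filled in later by a human rater: 1 = accurate, 0 = not
        })
    return records


def share_accurate(records: list) -> Optional[float]:
    """Return the fraction of rated answers marked 1 on the binary scale.

    Unrated records (rating is None) are ignored; returns None if nothing is rated.
    """
    rated = [r["rating"] for r in records if r["rating"] in (0, 1)]
    if not rated:
        return None
    return sum(rated) / len(rated)


if __name__ == "__main__":
    # Placeholder model call so the sketch runs without network access.
    def fake_ask(question: str) -> str:
        return "(model answer to: " + question + ")"

    demo = run_prompts(
        ["Who first described SWOT analysis?", "When did SWOT analysis originate?"],
        fake_ask,
    )
    demo[0]["rating"] = 1  # illustrative ratings only, not data from the study
    demo[1]["rating"] = 0
    print(share_accurate(demo))
```

Passing the model call in as a function keeps the rating logic testable without network access; in practice the questions and the rated records would be read from and written to spreadsheet files.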