Shardlow Matthew, Przybyła Piotr
Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, United Kingdom.
LaSTUS, Universitat Pompeu Fabra, Barcelona, Spain.
PLoS One. 2024 Dec 4;19(12):e0307521. doi: 10.1371/journal.pone.0307521. eCollection 2024.
This work is intended as a voice in the discussion over previous claims that a pretrained large language model (LLM) based on the Transformer model architecture can be sentient. Such claims have been made concerning the LaMDA model and also concerning the current wave of LLM-powered chatbots, such as ChatGPT. This claim, if confirmed, would have serious ramifications in the Natural Language Processing (NLP) community due to the widespread use of similar models. However, here we take the position that such a large language model cannot be conscious, and that LaMDA in particular exhibits no advances over other similar models that would qualify it as sentient. We justify this by analysing the Transformer architecture through the lens of the Integrated Information Theory of consciousness. We see the claims of sentience as part of a wider tendency to use anthropomorphic language in NLP reporting. Regardless of the veracity of the claims, we consider this an opportune moment to take stock of progress in language modelling and to consider the ethical implications of the task. To make this work helpful for readers outside the NLP community, we also present the necessary background in language modelling.