Enhancing supermarket robot interaction: an equitable multi-level LLM conversational interface for handling diverse customer intents.

Authors

Nandkumar Chandran, Peternel Luka

Affiliation

Department of Cognitive Robotics, Delft University of Technology, Delft, Netherlands.

Publication

Front Robot AI. 2025 Apr 29;12:1576348. doi: 10.3389/frobt.2025.1576348. eCollection 2025.

Abstract

This paper presents the design and evaluation of a comprehensive system for developing voice-based interfaces that support users in supermarkets. These interfaces enable shoppers to convey their needs through both generic and specific queries. Although customisable state-of-the-art systems like GPTs from OpenAI are easily accessible and adaptable, featuring low-code deployment with options for functional integration, they still face challenges such as increased response times and limitations in strategic control for tailored use cases and cost optimization. Motivated by the goal of crafting equitable and efficient conversational agents with a touch of personalisation, this study advances on two fronts: 1) a comparative analysis of four popular off-the-shelf speech recognition technologies to identify the most accurate model for different genders (male/female) and languages (English/Dutch) and 2) the development and evaluation of a novel multi-LLM supermarket chatbot framework, comparing its performance with a specialized GPT model powered by GPT-4 Turbo, using the Artificial Social Agent Questionnaire (ASAQ) and qualitative participant feedback. Our findings reveal that OpenAI's Whisper leads in speech recognition accuracy across genders and languages, and that our proposed multi-LLM chatbot architecture improved on the benchmarked GPT model in all 13 evaluated aspects, with statistically significant gains in four of them: performance, user satisfaction, user-agent partnership, and self-image enhancement. The paper concludes with a simple method for supermarket robot navigation that maps the final chatbot response to the correct shelf numbers, so that the robot can plan sequential visits to them. This subsequently enables the effective use of low-level perception, motion planning, and control capabilities for product retrieval and collection. We hope that this work encourages more efforts to use multiple specialized smaller models instead of always relying on a single powerful model.
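
The shelf-mapping step mentioned in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `SHELF_BY_PRODUCT` table, the `shelves_to_visit` function, the keyword matching, and the choice to visit shelves in ascending order are all hypothetical, since the abstract does not describe the store layout or the matching rule.

```python
import re

# Hypothetical product-to-shelf lookup; the real store layout is not given in the abstract.
SHELF_BY_PRODUCT = {
    "milk": 3,
    "bread": 1,
    "eggs": 3,
    "pasta": 7,
}


def shelves_to_visit(response: str) -> list[int]:
    """Return sorted, de-duplicated shelf numbers for products named in a chatbot response."""
    # Lowercase the response and split it into plain words (assumed matching rule).
    words = set(re.findall(r"[a-z]+", response.lower()))
    # Collect the shelf of every known product that appears as a word in the response.
    shelves = {shelf for product, shelf in SHELF_BY_PRODUCT.items() if product in words}
    # Ascending shelf order stands in for the robot's sequential visit plan (assumption).
    return sorted(shelves)


print(shelves_to_visit("You will need pasta, eggs and milk."))  # [3, 7]
```

In this sketch, products that share a shelf collapse into a single visit, which is one plausible reading of planning sequential visits over shelf numbers.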


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b728/12069059/beab3f8c3873/frobt-12-1576348-g001.jpg
