Miyazawa Kazuki, Horii Takato, Aoki Tatsuya, Nagai Takayuki
Graduate School of Engineering Science, Osaka University, Osaka, Japan.
Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo, Japan.
Front Robot AI. 2019 Nov 29;6:131. doi: 10.3389/frobt.2019.00131. eCollection 2019.
The manner in which humans learn, plan, and decide actions is a very compelling subject. Moreover, the mechanisms behind high-level cognitive functions, such as action planning, language understanding, and logical thinking, have not yet been fully implemented in robotics. In this paper, we propose a framework for the simultaneous comprehension of concepts, actions, and language as a first step toward this goal. This is achieved by integrating various cognitive modules, chiefly multimodal categorization using a multilayered multimodal latent Dirichlet allocation (mMLDA). The integration of reinforcement learning with the mMLDA enables actions based on understanding. Furthermore, combining the mMLDA with grammar learning based on a Bayesian hidden Markov model (BHMM) allows the robot to verbalize its own actions and understand user utterances. We verify the potential of the proposed architecture through experiments using a real robot.
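To illustrate the multimodal categorization idea at the core of the abstract, the sketch below implements a minimal, single-layer analogue of multimodal LDA: tokens from different sensory modalities of the same object are pooled into one document and clustered by a collapsed Gibbs sampler. The toy objects, modality names, and all hyperparameters are hypothetical assumptions for illustration; the paper's actual mMLDA is multilayered and integrates many more modalities.

```python
import random
from collections import defaultdict

# Hypothetical toy data: each "object" is observed through two modalities
# (visual and haptic feature words). Multimodal LDA treats tokens from all
# modalities as draws from a shared per-object topic mixture; this is a
# single-layer sketch of that idea, not the paper's full multilayered model.
objects = [
    {"visual": ["red", "round", "red"],   "haptic": ["soft", "soft"]},
    {"visual": ["red", "round"],          "haptic": ["soft"]},
    {"visual": ["green", "long"],         "haptic": ["hard", "hard"]},
    {"visual": ["green", "long", "long"], "haptic": ["hard"]},
]

K, ALPHA, BETA, ITERS = 2, 0.1, 0.1, 200  # topics, priors, sweeps (assumed)
random.seed(0)

# Flatten each object's multimodal observations into (modality, word) tokens.
docs = [[(m, w) for m, ws in obj.items() for w in ws] for obj in objects]
V = len({t for d in docs for t in d})

# Count tables for the collapsed Gibbs sampler.
z = [[random.randrange(K) for _ in d] for d in docs]  # topic of each token
n_dk = [[0] * K for _ in docs]                        # topic counts per object
n_kw = [defaultdict(int) for _ in range(K)]           # token counts per topic
n_k = [0] * K
for d, doc in enumerate(docs):
    for i, tok in enumerate(doc):
        k = z[d][i]
        n_dk[d][k] += 1; n_kw[k][tok] += 1; n_k[k] += 1

for _ in range(ITERS):
    for d, doc in enumerate(docs):
        for i, tok in enumerate(doc):
            k = z[d][i]
            n_dk[d][k] -= 1; n_kw[k][tok] -= 1; n_k[k] -= 1
            # Full conditional: p(z=k) ∝ (n_dk+α)·(n_kw+β)/(n_k+Vβ)
            weights = [(n_dk[d][j] + ALPHA) * (n_kw[j][tok] + BETA)
                       / (n_k[j] + V * BETA) for j in range(K)]
            k = random.choices(range(K), weights=weights)[0]
            z[d][i] = k
            n_dk[d][k] += 1; n_kw[k][tok] += 1; n_k[k] += 1

# Each object's dominant topic serves as its learned multimodal category.
categories = [max(range(K), key=lambda j: n_dk[d][j]) for d in range(len(docs))]
print(categories)
```

On this separable toy corpus the two red/soft objects should typically end up sharing one category and the two green/hard objects the other, showing how co-occurrence across modalities alone can yield object categories without labels.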