Taniguchi Tadahiro, Yoshino Ryo, Takano Toshiaki
Emergent Systems Laboratory, College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Japan.
Adaptive Systems Laboratory, Department of Computer Science, Shizuoka Institute of Science and Technology, Fukuroi, Japan.
Front Neurorobot. 2018 May 22;12:22. doi: 10.3389/fnbot.2018.00022. eCollection 2018.
In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object takes a long time. In a real-time scenario, i.e., when time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an active perception method for the MHDP that uses the information gain (IG) maximization criterion and the lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that it is equivalent to minimizing the expected Kullback-Leibler divergence between the final recognition state and the recognition state after the next set of actions. However, a straightforward calculation of IG is practically impossible. Therefore, we derive a Monte Carlo approximation method for IG by making use of a property of the MHDP. We also show that, because of the structure of the graphical model of the MHDP, the IG is submodular and non-decreasing as a set function. The IG maximization problem therefore reduces to a submodular maximization problem, which means that the greedy and lazy greedy algorithms are effective and have a theoretical justification for their performance. We conducted one experiment using an upper-torso humanoid robot and a second one using synthetic data. The experimental results show that the method enables the robot to select a set of actions that allows it to recognize target objects quickly and accurately. The numerical experiment using the synthetic data shows that the proposed method works appropriately even when the number of actions is large and the set of target objects includes objects categorized into multiple classes. The results support our theoretical outcomes.
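The lazy greedy selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Monte Carlo IG estimate is replaced by a toy coverage function (which is also submodular and non-decreasing), and the action names, the `COVERAGE` table, and the functions `lazy_greedy` and `toy_gain` are hypothetical names introduced only for this example.

```python
import heapq
from typing import Callable, FrozenSet, List, Sequence


def lazy_greedy(actions: Sequence[str],
                gain_fn: Callable[[FrozenSet[str]], float],
                budget: int) -> List[str]:
    """Pick up to `budget` actions for a monotone submodular `gain_fn`.

    Submodularity means an action's marginal gain can only shrink as the
    selected set grows, so a gain computed earlier is a valid upper bound.
    """
    selected: List[str] = []
    current = gain_fn(frozenset())
    # Max-heap via negated gains. Entry: (-gain, tie_breaker, action, stamp),
    # where stamp is the size of the selected set when the gain was computed.
    heap = [(-(gain_fn(frozenset([a])) - current), i, a, 0)
            for i, a in enumerate(actions)]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        neg_gain, i, action, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # Gain is up to date and still on top: it beats every other bound.
            selected.append(action)
            current += -neg_gain
        else:
            # Stale bound: re-evaluate only this action and push it back.
            fresh = gain_fn(frozenset(selected) | {action}) - current
            heapq.heappush(heap, (-fresh, i, action, len(selected)))
    return selected


# Toy stand-in for IG (assumption): each hypothetical action "reveals" some
# features, and the value of a set of actions is the number of distinct
# features revealed.
COVERAGE = {
    "look": {1, 2, 3},
    "grasp": {3, 4},
    "shake": {4, 5, 6},
    "hit": {1, 2},
}


def toy_gain(chosen: FrozenSet[str]) -> float:
    revealed = set()
    for a in chosen:
        revealed |= COVERAGE[a]
    return float(len(revealed))


chosen_actions = lazy_greedy(list(COVERAGE), toy_gain, budget=2)
```

With a budget of two, the sketch selects `look` and then `shake`; after the first pick only `shake` needs to be re-evaluated, which is the saving that lazy evaluation gives over recomputing every marginal gain in every round.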