Vicarious AI, CA, USA.
Sci Robot. 2019 Jan 16;4(26). doi: 10.1126/scirobotics.aav3150.
Humans can infer concepts from image pairs and apply them in the physical world in a completely different setting, enabling tasks like IKEA assembly from diagrams. If robots could represent and infer high-level concepts, their ability to understand our intent and to transfer tasks between different environments would improve notably. To that end, we introduce a computational framework that replicates aspects of human concept learning. Concepts are represented as programs on a computer architecture consisting of a visual perception system, working memory, and action controller. The instruction set of this cognitive computer has commands for parsing a visual scene, directing gaze and attention, imagining new objects, manipulating the contents of a visual working memory, and controlling arm movement. Inferring a concept corresponds to inducing a program that can transform the input to the output. Some concepts require the use of imagination and recursion. Previously learned concepts simplify the learning of subsequent, more elaborate concepts and create a hierarchy of abstractions. We demonstrate how a robot can use these abstractions to interpret novel concepts presented to it as schematic images and then apply those concepts in very different situations. By bringing cognitive science ideas on mental imagery, perceptual symbols, embodied cognition, and deictic mechanisms into the realm of machine learning, our work moves us closer to the goal of building robots that have interpretable representations and common sense.
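The statement that inferring a concept corresponds to inducing a program that transforms the input to the output can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the six instruction names, the 5 x 5 grid, the scene encoding as (color, x, y) tuples, and the brute-force shortest-first search, which stands in for whatever induction method the authors actually use. It only shows the general idea of searching a small instruction set for a program consistent with input-output example pairs, then running that program on a new scene.

```python
from itertools import product

# Hypothetical toy setup (not the paper's architecture):
# a scene is a tuple of objects, each object is (color, x, y) on a SIZE x SIZE grid.
SIZE = 5


def attend(color):
    """Instruction factory: put the attention pointer on the first object of a color."""
    def op(scene, focus):
        for i, obj in enumerate(scene):
            if obj[0] == color:
                return scene, i
        return scene, None
    return op


def slide(dx, dy):
    """Instruction factory: slide the attended object until it reaches a wall."""
    def op(scene, focus):
        if focus is None:
            return scene, focus  # nothing attended, nothing to move
        color, x, y = scene[focus]
        while 0 <= x + dx < SIZE and 0 <= y + dy < SIZE:
            x, y = x + dx, y + dy
        moved = scene[:focus] + ((color, x, y),) + scene[focus + 1:]
        return moved, focus
    return op


# Illustrative instruction set; the names are invented for this sketch.
INSTRUCTIONS = {
    "attend_red": attend("red"),
    "attend_blue": attend("blue"),
    "slide_left": slide(-1, 0),
    "slide_right": slide(1, 0),
    "slide_up": slide(0, -1),
    "slide_down": slide(0, 1),
}


def run(program, scene):
    """Execute a sequence of instruction names on a scene and return the final scene."""
    focus = None
    for name in program:
        scene, focus = INSTRUCTIONS[name](scene, focus)
    return scene


def induce(examples, max_len=4):
    """Return the shortest program consistent with all (input, output) pairs, or None."""
    for length in range(1, max_len + 1):
        for program in product(INSTRUCTIONS, repeat=length):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None


# Two example pairs for the concept "move the red object to the left wall".
examples = [
    ((("red", 3, 1), ("blue", 2, 4)), (("red", 0, 1), ("blue", 2, 4))),
    ((("blue", 0, 0), ("red", 4, 2)), (("blue", 0, 0), ("red", 0, 2))),
]

program = induce(examples)
print(program)  # ('attend_red', 'slide_left')

# Apply the induced concept to a scene that was not among the examples.
new_scene = (("blue", 1, 1), ("red", 2, 3))
print(run(program, new_scene))
```

Exhaustive enumeration is used here only because it is short and easy to verify; it scales exponentially with program length, which is one reason a real system would need a guided search and reuse of previously learned programs.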