Bobick A F
MIT Media Laboratory, Cambridge 02139, USA.
Philos Trans R Soc Lond B Biol Sci. 1997 Aug 29;352(1358):1257-65. doi: 10.1098/rstb.1997.0108.
This paper presents several approaches to the machine perception of motion and discusses the role and levels of knowledge in each. In particular, different techniques of motion understanding are described as focusing on one of movement, activity or action. Movements are the most atomic primitives, requiring no contextual or sequence knowledge to be recognized; movement is often addressed using either view-invariant or view-specific geometric techniques. Activity refers to sequences of movements or states, where the only real knowledge required is the statistics of the sequence; much of the recent work in gesture understanding falls within this category of motion perception. Finally, actions are larger-scale events, which typically include interaction with the environment and causal relationships; action understanding straddles the grey division between perception and cognition, computer vision and artificial intelligence. These levels are illustrated with examples drawn mostly from the group's work in understanding motion in video imagery. It is argued that the utility of such a division is that it makes explicit the representational competencies and manipulations necessary for perception.