Department of Intelligence Science and Technology, Matsuyama Laboratory, Graduate School of Informatics, Kyoto University, Kyoto, Japan.
IEEE Trans Pattern Anal Mach Intell. 2012 Aug;34(8):1645-57. doi: 10.1109/TPAMI.2011.258.
This paper presents a novel approach to 3D video understanding. A 3D video is a stream of 3D models of subjects in motion. Acquiring long sequences requires large storage space (2 GB for 1 min), and browsing data sets to extract meaningful information is tedious. We propose the topology dictionary to encode and describe 3D video content. The model consists of a dictionary of topology-based shape descriptors, which can be generated from either extracted patterns or training sequences. It relies on 1) topology description and classification using Reeb graphs, and 2) a Markov motion graph to represent topology change states. We show that Reeb graphs are a suitable high-level topology descriptor: they allow the dictionary to model complex sequences automatically, whereas other strategies would require prior knowledge of the shape and topology of the captured subjects. Our approach encodes 3D video sequences and can be applied to content-based description and summarization of 3D video sequences. Furthermore, labeling topology classes during a learning process enables the system to perform content-based event recognition. Experiments were carried out on various 3D videos. We showcase an application to progressive summarization of 3D video using the topology dictionary.
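As a rough illustration of the second component, a Markov motion graph over topology classes can be sketched as a table of first-order transition probabilities estimated from per-frame class labels. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `build_motion_graph` and `class_occupancy`, the string labels, and the toy sequence are all hypothetical, and the Reeb-graph classification that would produce the labels is not reproduced here.

```python
from collections import Counter, defaultdict


def build_motion_graph(labels):
    """Estimate first-order Markov transition probabilities between classes.

    `labels` is a per-frame list of topology class ids. It is a hypothetical
    input: in the paper such labels would come from Reeb-graph classification.
    Returns {source_class: {destination_class: probability}}.
    """
    counts = defaultdict(Counter)
    # Count every transition between consecutive frames, including self-loops.
    for src, dst in zip(labels, labels[1:]):
        counts[src][dst] += 1

    graph = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        # Normalize counts so outgoing probabilities of each node sum to 1.
        graph[src] = {dst: n / total for dst, n in outgoing.items()}
    return graph


def class_occupancy(labels):
    """Fraction of frames spent in each topology class."""
    total = len(labels)
    return {cls: n / total for cls, n in Counter(labels).items()}


# Toy label sequence (assumed for illustration only, not data from the paper).
frames = ["stand", "stand", "walk", "walk", "walk", "stand", "jump", "stand"]
graph = build_motion_graph(frames)
occupancy = class_occupancy(frames)

for src, outgoing in graph.items():
    print(src, outgoing)
print(occupancy)
```

In a sketch like this, the per-class occupancy is one plausible signal for deciding which topology classes to keep when shortening a sequence; how the paper actually weights classes for progressive summarization is not stated in the abstract.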