IEEE Trans Image Process. 2014 Apr;23(4):1569-80. doi: 10.1109/TIP.2014.2302677.
This paper considers the recognition of realistic human actions in videos based on spatio-temporal interest points (STIPs). Existing STIP-based action recognition approaches operate on intensity representations of the image data. Because of this, these approaches are sensitive to disturbing photometric phenomena, such as shadows and highlights. In addition, valuable information is neglected by discarding chromaticity from the photometric representation. These issues are addressed by color STIPs. Color STIPs are multichannel reformulations of STIP detectors and descriptors, for which we consider a number of chromatic and invariant representations derived from the opponent color space. Color STIPs are shown to outperform their intensity-based counterparts on the challenging UCF Sports, UCF11, and UCF50 action recognition benchmarks by more than 5% on average, where most of the gain is due to the multichannel descriptors. In addition, the results show that color STIPs are currently the single best low-level feature choice for STIP-based approaches to human action recognition.
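To make the opponent color space concrete, here is a minimal sketch of a per-pixel RGB-to-opponent conversion. It assumes the commonly used orthonormal opponent transform; the abstract does not state the exact formulation used in the paper, and the function name `rgb_to_opponent` and the sample pixel values are illustrative only. The sketch shows why the two chromatic channels are less sensitive to some photometric disturbances: adding the same offset to R, G, and B (a crude stand-in for a white highlight) changes only the intensity channel.

```python
import math

def rgb_to_opponent(r, g, b):
    """Map one RGB pixel to opponent channels (O1, O2, O3).

    Assumed transform (not taken from the abstract):
      O1 = (R - G) / sqrt(2)
      O2 = (R + G - 2B) / sqrt(6)
      O3 = (R + G + B) / sqrt(3)
    """
    o1 = (r - g) / math.sqrt(2)          # red-green chromatic channel
    o2 = (r + g - 2 * b) / math.sqrt(6)  # yellow-blue chromatic channel
    o3 = (r + g + b) / math.sqrt(3)      # intensity channel
    return o1, o2, o3

# Hypothetical pixel, and the same pixel with an equal offset on all channels.
pixel = (0.6, 0.3, 0.1)
offset = 0.2
shifted = tuple(c + offset for c in pixel)

base = rgb_to_opponent(*pixel)
lit = rgb_to_opponent(*shifted)

# O1 and O2 are unchanged by the equal offset; only O3 moves.
print("O1 change:", abs(lit[0] - base[0]))
print("O2 change:", abs(lit[1] - base[1]))
print("O3 change:", lit[2] - base[2])
```

An intensity-only representation corresponds to keeping O3 alone, which is the channel that absorbs the whole offset in this example.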