Department of Computer Science, Graduate School, Kyonggi University, 154-42 Gwanggyosan-ro Yeongtong-gu, Suwon-si 16227, Korea.
Department of Computer Science, Kyonggi University, 154-42 Gwanggyosan-ro Yeongtong-gu, Suwon-si 16227, Korea.
Sensors (Basel). 2019 Mar 3;19(5):1085. doi: 10.3390/s19051085.
This paper proposes a novel deep neural network model for solving the spatio-temporal-action-detection problem, by localizing all multiple-action regions and classifying the corresponding actions in an untrimmed video. The proposed model uses a spatio-temporal region proposal method to effectively detect multiple-action regions. First, in the temporal region proposal, anchor boxes were generated by targeting regions expected to potentially contain actions. Unlike the conventional temporal region proposal methods, the proposed method uses a complementary two-stage method to effectively detect the temporal regions of the respective actions occurring asynchronously. In addition, to detect a principal agent performing an action among the people appearing in a video, the spatial region proposal process was used. Further, coarse-level features contain comprehensive information of the whole video and have been frequently used in conventional action-detection studies. However, they cannot provide detailed information of each person performing an action in a video. In order to overcome the limitation of coarse-level features, the proposed model additionally learns fine-level features from the proposed action tubes in the video. Various experiments conducted using the LIRIS-HARL and UCF-10 datasets confirm the high performance and effectiveness of the proposed deep neural network model.
本文提出了一种新的深度神经网络模型,用于解决时空动作检测问题,通过在未修剪的视频中定位所有多动作区域并对相应的动作进行分类。所提出的模型使用时空区域提议方法来有效地检测多动作区域。首先,在时间区域提议中,通过针对可能包含动作的预期区域生成锚框。与传统的时间区域提议方法不同,所提出的方法使用互补的两阶段方法来有效地检测异步发生的各个动作的时间区域。此外,为了在视频中出现的人中检测执行动作的主要代理,使用了空间区域提议过程。进一步地,粗粒度特征包含整个视频的综合信息,并且已经在传统的动作检测研究中经常使用。然而,它们不能提供视频中执行动作的每个人的详细信息。为了克服粗粒度特征的局限性,所提出的模型还从视频中的提议动作管中学习细粒度特征。使用 LIRIS-HARL 和 UCF-10 数据集进行的各种实验证实了所提出的深度神经网络模型的高性能和有效性。