Sooksatra Sorn, Watcharapinchai Sitapa
National Electronic and Computer Technology Center, National Science and Technology Development Agency, Pathum Thani 12120, Thailand.
J Imaging. 2022 Jul 23;8(8):207. doi: 10.3390/jimaging8080207.
Temporal-action proposal generation (TAPG) is a well-known pre-processing step for temporal-action localization, and it strongly affects localization performance on untrimmed videos. Interest in proposal generation has grown in recent years, with researchers focusing on anchor-based and boundary-based methods for generating action proposals. This paper provides a comprehensive review of temporal-action proposal generation, covering network architectures and empirical results. It also discusses the pre-processing of input data required for network construction. The review draws on research literature on temporal-action proposal generation published from 2012 to 2022 for performance evaluation and comparison. Using specific keywords, we searched several well-known databases and selected 71 related studies according to their contributions and evaluation criteria. For each category, the contributions and methodologies are summarized and analyzed in tabular form. Results from state-of-the-art research are further analyzed to identify the limitations and challenges of action proposal generation. Across two TAPG benchmarks, reported average recall ranges from 60% to 78%. Finally, several potential directions for future research are suggested based on the current limitations of the related studies.
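The abstract reports TAPG performance as average recall without defining the metric. The sketch below illustrates one common way it is computed: a ground-truth action counts as recalled when some proposal overlaps it with a temporal intersection-over-union (tIoU) at or above a threshold, and recall is then averaged over a range of thresholds. The function names, the (start, end) segment representation, and the default threshold grid of 0.50 to 0.95 in steps of 0.05 are assumptions made for illustration; they are not taken from the paper, and individual benchmarks may use a different grid or cap the number of proposals per video.

```python
from typing import List, Optional, Sequence, Tuple

Segment = Tuple[float, float]  # hypothetical (start, end) representation, in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Temporal IoU of two segments; 0.0 when they do not overlap."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_threshold(
    ground_truth: Sequence[Segment],
    proposals: Sequence[Segment],
    threshold: float,
) -> float:
    """Fraction of ground-truth segments matched by at least one proposal."""
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for gt in ground_truth
        if any(temporal_iou(gt, p) >= threshold for p in proposals)
    )
    return hits / len(ground_truth)


def average_recall(
    ground_truth: Sequence[Segment],
    proposals: Sequence[Segment],
    thresholds: Optional[List[float]] = None,
) -> float:
    """Mean recall over tIoU thresholds (assumed default: 0.50 to 0.95, step 0.05)."""
    if thresholds is None:
        # Built from integers to avoid accumulating floating-point error.
        thresholds = [i / 100 for i in range(50, 100, 5)]
    recalls = [recall_at_threshold(ground_truth, proposals, t) for t in thresholds]
    return sum(recalls) / len(recalls)
```

For example, `average_recall([(0.0, 10.0), (20.0, 30.0)], [(0.0, 10.0), (20.0, 28.2)])` scores the first action as recalled at every threshold and the second only at thresholds up to its tIoU of about 0.82. In practice the proposal list is usually truncated to a fixed number of top-scoring proposals per video before this computation.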