Jiao Licheng, Zhang Ruohan, Liu Fang, Yang Shuyuan, Hou Biao, Li Lingling, Tang Xu
IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):3195-3215. doi: 10.1109/TNNLS.2021.3053249. Epub 2022 Aug 3.
Video object detection, a basic task in computer vision, is rapidly evolving and widely used. In recent years, deep learning methods have spread rapidly through the field, achieving excellent results compared with traditional methods. However, the redundant information and abundant spatiotemporal information in video data pose a serious challenge to video object detection. Many scholars have therefore investigated deep learning detection algorithms for video data and have achieved remarkable results. Given the wide range of applications, a comprehensive review of research on video object detection is both necessary and challenging. This survey attempts to link and systematize the latest research on video object detection, classifying and analyzing video detection algorithms based on specific representative models. The differences and connections between video object detection and similar tasks are systematically presented, along with the evaluation metrics and the detection performance of nearly 40 models on two datasets. Finally, the applications of video object detection and the challenges it faces are discussed.