Saleem Gulshan, Bajwa Usama Ijaz, Hammad Raza Rana, Alqahtani Fayez Hussain, Tolba Amr, Xia Feng
Department of Computer Science, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan.
Electronics and Power Engineering Department, Pakistan Navy Engineering College (PNEC), National University of Sciences and Technology (NUST), Karachi, Pakistan.
PeerJ Comput Sci. 2022 Oct 14;8:e1117. doi: 10.7717/peerj-cs.1117. eCollection 2022.
Smart surveillance is a difficult task that is gaining popularity due to its direct link to human safety. Today, many indoor and outdoor surveillance systems are in use in public places and smart cities. Because these systems are expensive to deploy, they are out of reach for the vast majority of the public and private sectors. Automated surveillance is challenging because there is no precise definition of an anomaly, especially when large amounts of data, such as 24/7 CCTV footage, must be processed. When such systems are deployed in real-time environments, the high computational resource requirements of automated surveillance become a major bottleneck. Another challenge is recognizing anomalies accurately, since achieving high accuracy while also reducing computational cost is harder still. To address these challenges, this research develops a system that is both efficient and cost-effective. Although 3D convolutional neural networks have proven to be accurate, they are prohibitively expensive for practical use, particularly in real-time surveillance. In this article, we present two contributions: a resource-efficient framework for anomaly recognition problems, and two-class and multi-class anomaly recognition on spatially augmented surveillance videos. This research aims to reduce computational overhead while maintaining recognition accuracy. The proposed Temporal based Anomaly Recognizer (TAR) framework combines a partial shift strategy with a 2D convolutional architecture-based model, namely MobileNetV2. Extensive experiments were carried out to evaluate the model's performance on the UCF Crime dataset, with MobileNetV2 as the baseline architecture; it achieved an accuracy of 88%, which is 2.47% higher than the available state of the art. The proposed framework achieves 52.7% accuracy for multi-class anomaly recognition on the UCF Crime2Local dataset.
The proposed model has been tested in real-time camera stream settings and can handle six streams simultaneously without the need for additional resources.
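The partial shift strategy named in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the shift, so the following assumes a bidirectional shift in the style of a temporal shift module: a small fraction of each frame's feature channels is replaced by the same channels from the next frame, another equal fraction by the channels from the previous frame, and the rest are left untouched. The function name `partial_temporal_shift` and the parameter `fold_div` are hypothetical and chosen only for illustration; plain Python lists stand in for the feature tensors that a 2D backbone such as MobileNetV2 would produce.

```python
def partial_temporal_shift(frames, fold_div=4):
    """Shift a fraction of channels along the time axis (illustrative sketch).

    frames   : list of T per-frame feature vectors, each a list of C numbers.
    fold_div : assumed hyperparameter; C // fold_div channels are shifted in
               each direction, the remaining channels are left unchanged.
    """
    if not frames:
        return []

    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div  # channels shifted per direction

    # Start from a copy so unshifted channels keep their original values.
    shifted = [list(frame) for frame in frames]

    for t in range(num_frames):
        # First `fold` channels: take values from the next frame
        # (zero-padded at the last frame).
        for c in range(fold):
            shifted[t][c] = frames[t + 1][c] if t + 1 < num_frames else 0
        # Next `fold` channels: take values from the previous frame
        # (zero-padded at the first frame).
        for c in range(fold, 2 * fold):
            shifted[t][c] = frames[t - 1][c] if t >= 1 else 0

    return shifted
```

Because the shift only moves existing values between neighbouring frames, it adds no multiplications or learned parameters, which is consistent with the stated goal of temporal modelling at the cost of a 2D network. Whether TAR shifts in one or both directions, and what fraction of channels it shifts, is not stated in the abstract.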