Taha Roqaia Adel, Youssif Aliaa Abdel-Halim, Fouad Mohamed Mostafa
College of Computing and Information Technology, Arab Academy for Science, Technology and Maritime Transport (AASTMT), Smart Village, Cairo, Egypt.
Sci Rep. 2024 Nov 13;14(1):27868. doi: 10.1038/s41598-024-78414-2.
Video surveillance remains challenging because techniques for recognizing anomalous events in human activity still need improvement. Growing security concerns make standard CCTV systems insufficient because of high monitoring costs and operator fatigue. Automated security systems with real-time event recognition are therefore essential. This research introduces a semantic key frame extraction algorithm, based on action recognition, that reduces the volume of frames in large-scale video data. This approach has not previously been applied with the ResNet50, VGG19, EfficientNetB7, and ViT_b16 models for recognizing anomalous events in surveillance videos. The proposed method addresses the large volume of frames that surveillance videos generate, which demands efficient processing. The models were evaluated on a large number of videos from the UCF-Crime dataset, with both abnormal and normal videos included in the training and testing phases. EfficientNetB7 achieved 86.34% accuracy, VGG19 reached 87.90%, ResNet50 attained 90.46%, and ViT_b16 performed best with 95.87% accuracy. Compared with state-of-the-art models from other studies, the transformer model (ViT_b16) achieved higher accuracy, demonstrating a significant improvement in recognizing anomalous events.
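The abstract does not spell out the semantic key frame extraction algorithm, so the following is only an illustrative sketch of the general idea of reducing frame volume before classification. It swaps in a much simpler technique, plain frame differencing, and is not the authors' method. The function name `select_key_frames`, the `threshold` parameter, and the synthetic clip are all hypothetical.

```python
import numpy as np


def select_key_frames(frames, threshold=10.0):
    """Return indices of frames that differ enough from the last kept frame.

    ASSUMPTION: this uses mean absolute pixel difference, a simple stand-in
    for the semantic, action-recognition-based selection the paper describes.
    """
    if len(frames) == 0:
        return []
    kept = [0]  # always keep the first frame as the initial reference
    reference = frames[0].astype(np.float32)
    for i in range(1, len(frames)):
        current = frames[i].astype(np.float32)
        score = float(np.mean(np.abs(current - reference)))
        if score > threshold:
            kept.append(i)
            reference = current  # later frames are compared to this one
    return kept


# Synthetic grayscale clip: 5 dark frames followed by 5 bright frames.
clip = np.concatenate([
    np.zeros((5, 8, 8), dtype=np.uint8),
    np.full((5, 8, 8), 200, dtype=np.uint8),
])
key_indices = select_key_frames(clip, threshold=10.0)
print(key_indices)  # [0, 5]
```

Only the retained frames would then be passed to a classifier such as ResNet50 or ViT_b16, which is where the reduction in processing load would come from. A semantic method would replace the pixel-difference score with one derived from recognized actions.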