School of Information Science and Technology, North China University of Technology, Beijing 100144, China.
Sensors (Basel). 2023 Jan 26;23(3):1385. doi: 10.3390/s23031385.
As surveillance cameras become increasingly widespread, detecting abnormal behavior in surveillance videos has attracted broad attention. A model's generalization ability and parameter overhead both affect detection accuracy. To address the poor generalization and high parameter overhead of existing anomaly detection methods, we propose a three-dimensional multi-branch convolutional fusion network named "Branch-Fusion Net". Its multi-branch structure not only significantly reduces parameter overhead but also improves generalization by interpreting the input feature map from different perspectives. To suppress uninformative features during training, we propose a simple yet effective Channel Spatial Attention Module (CSAM), which attends first to key channels and then to key spatial regions, suppressing useless features and enhancing important ones. We combine Branch-Fusion Net and CSAM into a local feature extraction network and use a Bi-Directional Gated Recurrent Unit (Bi-GRU) to extract global feature information. Experiments on a self-built Crimes-mini dataset show an anomaly detection accuracy of 93.55% on the test set. The results show that the proposed model significantly improves the accuracy of anomaly detection in surveillance videos while keeping parameter overhead low.
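The abstract does not specify how the branches are laid out, so the following is only a rough sketch of why a multi-branch 3D block can carry fewer parameters than a single dense 3D convolution. It assumes an Inception-style layout in which each branch first applies a 1x1x1 bottleneck and then a 3x3x3 convolution, with the branch outputs concatenated. The function names, the number of branches, and the reduction factor are all hypothetical and are not taken from the paper.

```python
def conv3d_params(c_in, c_out, kernel, bias=True):
    """Learnable parameters in one 3D convolution layer."""
    kt, kh, kw = kernel
    weights = c_in * c_out * kt * kh * kw
    return weights + (c_out if bias else 0)


def single_path_params(c_in, c_out):
    """Baseline: one dense 3x3x3 convolution from c_in to c_out channels."""
    return conv3d_params(c_in, c_out, (3, 3, 3))


def multi_branch_params(c_in, c_out, branches=4, reduction=4):
    """Assumed multi-branch block (not the paper's exact design).

    Each branch: 1x1x1 bottleneck to c_in // reduction channels,
    then a 3x3x3 convolution producing c_out // branches channels.
    Concatenating the branch outputs restores c_out channels.
    """
    mid = c_in // reduction
    per_branch_out = c_out // branches
    per_branch = (
        conv3d_params(c_in, mid, (1, 1, 1))
        + conv3d_params(mid, per_branch_out, (3, 3, 3))
    )
    return branches * per_branch


if __name__ == "__main__":
    dense = single_path_params(64, 64)
    branched = multi_branch_params(64, 64)
    print(f"dense 3x3x3 conv : {dense} parameters")
    print(f"4-branch block   : {branched} parameters")
    print(f"ratio            : {branched / dense:.3f}")
```

Under these assumed settings, a 64-to-64-channel block needs 110,656 parameters as a single dense convolution and 31,872 as four bottlenecked branches. The actual savings in Branch-Fusion Net depend on its real branch design, which the abstract does not give.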