Research into the Applications of a Multi-Scale Feature Fusion Model in the Recognition of Abnormal Human Behavior.

Affiliations

School of Information Science and Technology, Hebei Agricultural University, Baoding 071001, China.

Hebei Key Laboratory of Agricultural Big Data, Baoding 071001, China.

Publication information

Sensors (Basel). 2024 Aug 5;24(15):5064. doi: 10.3390/s24155064.

DOI:10.3390/s24155064

PMID:39124111

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11314932/

Abstract

As population aging intensifies in modern society, accurately and promptly identifying and responding to sudden abnormal behaviors of the elderly has become an urgent and important issue. In current research on computer vision-based abnormal behavior recognition, most algorithms show poor generalization and recognition ability in practical applications and are limited to recognizing a single type of action. To address these problems, an MSCS-DenseNet-LSTM model based on a multi-scale attention mechanism is proposed. The model integrates the MSCS (Multi-Scale Convolutional Structure) module into the initial convolutional layer of DenseNet to form a multi-scale convolution structure. It introduces the improved Inception X module into the Dense Block to form an Inception Dense structure, and performs feature fusion progressively through each Dense Block. The CBAM attention module is added to the two-layer LSTM to enhance the model's generalization ability while ensuring accurate recognition of abnormal actions. Furthermore, because existing abnormal behavior datasets cover only a single action, an RGB image dataset (RIDS) and a contour image dataset (CIDS) containing various abnormal behaviors were constructed. Experimental results show that the proposed MSCS-DenseNet-LSTM model achieved accuracy, sensitivity, and specificity of 98.80%, 98.75%, and 98.82% and of 98.30%, 98.28%, and 98.38% on the two datasets, respectively.
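The three reported metrics can be illustrated with a minimal Python sketch. The abstract does not say how the metrics were computed for multiple behavior classes, so this sketch assumes the standard definitions from a binary (one-vs-rest) confusion matrix. The function names and the "fall"/"walk" labels are hypothetical and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, TN, FN, treating `positive` as the abnormal class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no samples")
    accuracy = (tp + tn) / total
    # Sensitivity: share of truly abnormal samples that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of truly normal samples that were left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity


# Hypothetical labels for illustration only (not data from the paper).
y_true = ["fall", "fall", "fall", "walk", "walk", "walk", "walk", "walk"]
y_pred = ["fall", "fall", "walk", "walk", "walk", "walk", "walk", "fall"]

counts = confusion_counts(y_true, y_pred, positive="fall")
accuracy, sensitivity, specificity = binary_metrics(*counts)
print(counts, accuracy, sensitivity, specificity)
```

With several abnormal behavior classes, one common convention is to compute these values per class and average them. Whether the authors did so is not stated here.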
