Vision-Based HAR in UAV Videos Using Histograms and Deep Learning Techniques.

Affiliation

School of Computer Science and Engineering, VIT-AP University, Amaravati 522237, India.

Publication information

Sensors (Basel). 2023 Feb 25;23(5):2569. doi: 10.3390/s23052569.

DOI:10.3390/s23052569

PMID:36904773

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10007408/

Abstract

Activity recognition in unmanned aerial vehicle (UAV) surveillance is relevant to various computer vision applications, such as image retrieval, pose estimation, object detection in still images and video frames, face recognition, and video action recognition. In UAV-based surveillance, video segments captured from aerial vehicles make it challenging to recognize and distinguish human behavior. In this research, a hybrid model combining histogram of oriented gradients (HOG), mask region-based convolutional neural network (Mask-RCNN), and bidirectional long short-term memory (Bi-LSTM) is employed to recognize single- and multi-human activities from aerial data. The HOG algorithm extracts patterns, Mask-RCNN extracts feature maps from the raw aerial image data, and the Bi-LSTM network exploits the temporal relationship between frames to identify the underlying action in the scene. Because it processes the frame sequence in both directions, the Bi-LSTM network reduces the error rate. This novel architecture produces enhanced segmentation through histogram-gradient-based instance segmentation and improves the accuracy of human activity classification through the Bi-LSTM approach. Experimental results show that the proposed model outperforms other state-of-the-art models, achieving 99.25% accuracy on the YouTube-Aerial dataset.
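To make the HOG stage mentioned in the abstract more concrete, the following is a minimal sketch of the gradient-orientation histogram at the core of HOG, computed for a single cell. It is not the authors' implementation: the function name `orientation_histogram`, the 9-bin unsigned (0 to 180 degree) layout, the central-difference gradients, and the hard binning without interpolation or block normalization are all assumptions chosen for illustration.

```python
import math


def orientation_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell.

    `image` is a 2-D list of grayscale intensities (hypothetical input format).
    Border pixels are skipped because central differences need both neighbours.
    """
    height, width = len(image), len(image[0])
    bin_width = 180.0 / bins
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central-difference gradients in x and y.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat region: no orientation to record
            # Unsigned orientation: fold the angle into [0, 180).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Hard binning; clamp guards against a rounding result of 180.0.
            index = min(int(angle // bin_width), bins - 1)
            hist[index] += magnitude
    return hist


# A vertical edge (intensity changes along x) has purely horizontal gradients,
# so all gradient energy falls into the first bin (0 to 20 degrees).
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
print(orientation_histogram(vertical_edge))
```

A full HOG descriptor would tile the image into many such cells, normalize the histograms over overlapping blocks, and concatenate them; how the paper combines these features with Mask-RCNN and the Bi-LSTM is not specified in the abstract.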
