


Unmanned aerial vehicles for human detection and recognition using neural-network model.

Author Information

Abbas Yawar, Al Mudawi Naif, Alabdullah Bayan, Sadiq Touseef, Algarni Asaad, Rahman Hameedur, Jalal Ahmad

Affiliations

Faculty of Computer Science and AI, Air University, Islamabad, Pakistan.

Department of Computer Science, College of Computer Science and Information System, Najran University, Najran, Saudi Arabia.

Publication Information

Front Neurorobot. 2024 Dec 4;18:1443678. doi: 10.3389/fnbot.2024.1443678. eCollection 2024.

DOI: 10.3389/fnbot.2024.1443678
PMID: 39698500
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11652500/
Abstract

INTRODUCTION

Recognizing human actions is crucial for enabling machines to understand human behavior, with applications spanning video-based surveillance systems, human-robot collaboration, sports analysis systems, and entertainment. The immense diversity in human movement and appearance poses a significant challenge in this field, especially when dealing with drone-recorded (RGB) videos. Factors such as dynamic backgrounds, motion blur, occlusions, varying video capture angles, and exposure issues greatly complicate recognition tasks.

METHODS

In this study, we propose a method that addresses these challenges in RGB videos captured by drones. Our approach begins by segmenting the video into individual frames, followed by preprocessing steps applied to these RGB frames. The preprocessing aims to reduce computational costs, optimize image quality, and enhance foreground objects while removing the background.
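The background-removal step described above could be sketched, for illustration only (this is not the authors' code, and the actual preprocessing pipeline is more involved), with a simple per-pixel median background model over the extracted frames:

```python
import numpy as np

def foreground_masks(frames, threshold=30.0):
    """Estimate a static background as the per-pixel median over all frames,
    then mark pixels that deviate strongly from it as foreground.

    frames: array of shape (n_frames, height, width), grayscale intensities.
    Returns a boolean array of the same shape (True = foreground).
    """
    frames = np.asarray(frames, dtype=np.float32)
    background = np.median(frames, axis=0)   # per-pixel background estimate
    diff = np.abs(frames - background)       # deviation from the background
    return diff > threshold                  # large deviation -> foreground

# Toy example: a dark background with a bright stripe moving down the frame.
frames = np.zeros((5, 8, 8), dtype=np.float32)
for t in range(5):
    frames[t, t, :] = 255.0                  # one bright row per frame
masks = foreground_masks(frames)             # only the moving stripe is flagged
```

A median model like this suppresses anything static; for a moving drone camera, a real system would need motion compensation or a learned detector instead, which is why the paper relies on YOLOv9 for the detection stage.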

RESULT

This results in improved visibility of foreground objects while eliminating background noise. Next, we employ the YOLOv9 detection algorithm to identify human bodies within the images. From the grayscale silhouette, we extract the human skeleton and identify 15 important locations: the head, neck, and belly button, plus the left and right shoulders, elbows, wrists, hips, knees, and ankles. From these points, we extract specific positions, angular and distance relationships between them, as well as 3D point clouds and fiducial points. Subsequently, we optimize this data using the kernel discriminant analysis (KDA) optimizer, followed by classification using a deep convolutional neural network (CNN). To validate our system, we conducted experiments on three benchmark datasets: UAV-Human, UCF, and Drone-Action.
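The angular and distance features over the 15 keypoints could be computed, as a minimal sketch (the keypoint names and layout here are assumptions, not the paper's definitions), along these lines:

```python
import numpy as np

# Hypothetical labels for the 15 keypoints named in the abstract.
KEYPOINTS = ["head", "neck", "l_shoulder", "r_shoulder", "l_elbow", "r_elbow",
             "l_wrist", "r_wrist", "l_hip", "r_hip", "l_knee", "r_knee",
             "l_ankle", "r_ankle", "belly"]

def pairwise_distances(points):
    """Euclidean distance between every pair of keypoints.
    points: (15, 2) or (15, 3) array. Returns a (15, 15) matrix."""
    diff = points[:, None, :] - points[None, :, :]   # broadcast all pairs
    return np.linalg.norm(diff, axis=-1)

def joint_angle(a, b, c):
    """Angle at joint b formed by segments b->a and b->c, in degrees."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Example: a right angle at the elbow (shoulder-elbow-wrist).
shoulder = np.array([0.0, 0.0])
elbow = np.array([0.0, -1.0])
wrist = np.array([1.0, -1.0])
angle = joint_angle(shoulder, elbow, wrist)   # 90.0 degrees
```

Stacking such distances and joint angles per frame yields a fixed-length feature vector, which is the kind of input a KDA projection followed by a CNN classifier could consume.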

DISCUSSION

On these datasets, our suggested model produced corresponding action recognition accuracies of 0.68, 0.75, and 0.83.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/5fa374d72c86/fnbot-18-1443678-g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/213636cc5829/fnbot-18-1443678-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/50f93ecfe4ac/fnbot-18-1443678-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/ef1152f739cf/fnbot-18-1443678-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/c66505e6e496/fnbot-18-1443678-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/ceaa30d2576b/fnbot-18-1443678-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/18adddf957ae/fnbot-18-1443678-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/8b041e6f3bc9/fnbot-18-1443678-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/a21a45fbbd28/fnbot-18-1443678-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/3a4fd90f5546/fnbot-18-1443678-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/8a8dffdce12b/fnbot-18-1443678-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/0df8eeeae9a8/fnbot-18-1443678-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/8a95c5c6b646/fnbot-18-1443678-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/2f144df2cfc0/fnbot-18-1443678-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/5fa374d72c86/fnbot-18-1443678-g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/213636cc5829/fnbot-18-1443678-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/50f93ecfe4ac/fnbot-18-1443678-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/ef1152f739cf/fnbot-18-1443678-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/c66505e6e496/fnbot-18-1443678-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/ceaa30d2576b/fnbot-18-1443678-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/18adddf957ae/fnbot-18-1443678-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/8b041e6f3bc9/fnbot-18-1443678-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/a21a45fbbd28/fnbot-18-1443678-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/3a4fd90f5546/fnbot-18-1443678-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/8a8dffdce12b/fnbot-18-1443678-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/0df8eeeae9a8/fnbot-18-1443678-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/8a95c5c6b646/fnbot-18-1443678-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/2f144df2cfc0/fnbot-18-1443678-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc42/11652500/5fa374d72c86/fnbot-18-1443678-g014.jpg

Similar Articles

1. Unmanned aerial vehicles for human detection and recognition using neural-network model. Front Neurorobot. 2024 Dec 4;18:1443678. doi: 10.3389/fnbot.2024.1443678. eCollection 2024.
2. Using Deep Learning and Low-Cost RGB and Thermal Cameras to Detect Pedestrians in Aerial Images Captured by Multirotor UAV. Sensors (Basel). 2018 Jul 12;18(7):2244. doi: 10.3390/s18072244.
3. Target detection and classification via EfficientDet and CNN over unmanned aerial vehicles. Front Neurorobot. 2024 Aug 30;18:1448538. doi: 10.3389/fnbot.2024.1448538. eCollection 2024.
4. Unmanned aerial vehicle based multi-person detection via deep neural network models. Front Neurorobot. 2025 Apr 17;19:1582995. doi: 10.3389/fnbot.2025.1582995. eCollection 2025.
5. Real-Time and Accurate Drone Detection in a Video with a Static Background. Sensors (Basel). 2020 Jul 10;20(14):3856. doi: 10.3390/s20143856.
6. Drone Model Classification Using Convolutional Neural Network Trained on Synthetic Data. J Imaging. 2022 Aug 12;8(8):218. doi: 10.3390/jimaging8080218.
7. Deep Convolutional Neural Network for Flood Extent Mapping Using Unmanned Aerial Vehicles Data. Sensors (Basel). 2019 Mar 27;19(7):1486. doi: 10.3390/s19071486.
8. Vision-Based HAR in UAV Videos Using Histograms and Deep Learning Techniques. Sensors (Basel). 2023 Feb 25;23(5):2569. doi: 10.3390/s23052569.
9. Face mask identification with enhanced cuckoo optimization and deep learning-based faster regional neural network. Sci Rep. 2024 Nov 29;14(1):29719. doi: 10.1038/s41598-024-78746-z.
10. A dataset for multi-sensor drone detection. Data Brief. 2021 Oct 27;39:107521. doi: 10.1016/j.dib.2021.107521. eCollection 2021 Dec.
