Detecting Human Actions in Drone Images Using YoloV5 and Stochastic Gradient Boosting.

Affiliations

Department of Electrical and Computer Engineering, COMSATS University Islamabad, Islamabad 45550, Pakistan.

National Institute of Informatics, Tokyo 101-8430, Japan.

Publication Information

Sensors (Basel). 2022 Sep 16;22(18):7020. doi: 10.3390/s22187020.

Abstract

Human action recognition and detection from unmanned aerial vehicles (UAVs), or drones, has emerged as a popular technical challenge in recent years, since it is relevant to many use cases, from environmental monitoring to search and rescue. It faces a number of difficulties stemming mainly from image acquisition, image content, and processing constraints. Because drones' flying conditions constrain image acquisition, human subjects may appear in images at variable scales and orientations and under varying occlusion, which makes action recognition more difficult. We explore low-resource methods for machine learning (ML)-based action recognition using a previously collected real-world dataset (the "Okutama-Action" dataset). This dataset contains situations representative of action recognition tasks, yet is controlled for image acquisition parameters such as camera angle and flight altitude. We investigate a combination of object recognition and classifier techniques to support single-image action identification. Our architecture integrates YoloV5 with a gradient boosting classifier; the rationale is to couple a scalable and efficient object recognition system with a classifier able to incorporate samples of variable difficulty. In an ablation study, we test different YoloV5 architectures and evaluate the performance of our method on the Okutama-Action dataset. Our approach outperformed previous architectures applied to the Okutama dataset, which differed in their object identification and classification pipelines: we hypothesize that this is a consequence of both YoloV5's performance and the overall fit of our pipeline to the specific characteristics of the Okutama dataset in terms of the bias-variance tradeoff.
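
For concreteness, the detect-then-classify architecture the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the per-detection feature extraction (a flattened grayscale crop), the class index, thresholds, and all hyperparameters are placeholder assumptions, since the abstract does not specify them. YOLOv5 localizes people in a frame, and scikit-learn's GradientBoostingClassifier with subsample < 1.0 (the "stochastic" variant of gradient boosting) labels each detected crop with an action.

```python
# Minimal sketch of a YOLOv5 + stochastic gradient boosting pipeline.
# Assumptions (not stated in the abstract): placeholder crop features,
# placeholder hyperparameters, COCO-pretrained 'yolov5s' weights.
import cv2
import numpy as np
import torch
from sklearn.ensemble import GradientBoostingClassifier

# 'yolov5s' is one of several YoloV5 sizes the paper's ablation compares.
detector = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

# subsample=0.8: each tree is fit on a random 80% of the training set,
# which is what makes the boosting "stochastic".
action_clf = GradientBoostingClassifier(n_estimators=200, subsample=0.8)
# action_clf.fit(X_train, y_train) must be run beforehand on labeled
# person crops (e.g., from Okutama-Action annotations).

def crop_features(image_bgr, box, size=(32, 32)):
    """Placeholder features: a normalized, flattened grayscale crop."""
    x1, y1, x2, y2 = (max(int(v), 0) for v in box)
    crop = image_bgr[y1:y2, x1:x2]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    return cv2.resize(gray, size).ravel().astype(np.float32) / 255.0

def detect_actions(image_bgr, person_class=0, conf_thresh=0.4):
    """Detect people with YOLOv5, then assign each crop an action label."""
    rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    dets = detector(rgb).xyxy[0].cpu().numpy()  # [x1, y1, x2, y2, conf, cls]
    results = []
    for x1, y1, x2, y2, conf, cls in dets:
        if int(cls) != person_class or conf < conf_thresh:
            continue
        feats = crop_features(image_bgr, (x1, y1, x2, y2))
        results.append(((x1, y1, x2, y2), action_clf.predict(feats[None, :])[0]))
    return results
```

In the paper's actual setting the detector would be fine-tuned on Okutama-Action and the classifier trained on its action labels; the sketch only fixes the overall two-stage architecture the abstract describes.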
