Zhang Tianyao, Hu Xiaoguang, Xiao Jin, Zhang Guofeng
School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China.
ShenYuan Honors College of Beihang University, Beihang University, Beijing 100191, China.
Sensors (Basel). 2020 Jun 7;20(11):3245. doi: 10.3390/s20113245.
What makes unmanned aerial vehicles (UAVs) intelligent is their capability to sense and understand new, unknown environments. Some studies use computer vision algorithms such as Visual Simultaneous Localization and Mapping (VSLAM) and Visual Odometry (VO) to sense the environment for pose estimation, obstacle avoidance, and visual servoing. However, understanding the new environment (i.e., enabling the UAV to recognize generic objects) remains an essential scientific problem that lacks a solution. This paper therefore takes a step toward understanding the items in an unknown environment. The aim of this research is to equip the UAV with a basic understanding capability for future high-level UAV flock applications. Specifically, first, the proposed understanding method combines machine learning with traditional algorithms to understand the unknown environment through RGB images; second, the You Only Look Once (YOLO) object detection system is integrated (based on TensorFlow) into a smartphone to perceive the positions and categories of 80 classes of objects in the images; third, the method makes the UAV more intelligent and relieves the operator of manual labor; fourth, detection accuracy and latency under working conditions are quantitatively evaluated, and the properties of generality (usable on various platforms), transportability (easily deployed from one platform to another), and scalability (easily updated and maintained) for UAV flocks are qualitatively discussed. The experiments suggest that the method is accurate enough to recognize various objects at high computational speed, with excellent generality, transportability, and scalability.
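The abstract describes a YOLO-based detector reporting the position and category of objects in an image. A core step in any such detector is non-maximum suppression (NMS), which keeps only the highest-scoring box among heavily overlapping candidates. The following is a minimal NumPy sketch of that step, not the authors' implementation; the function names and the IoU threshold of 0.5 are illustrative assumptions.

```python
import numpy as np

def iou(box, boxes):
    """Intersection-over-union of one box against an array of boxes.
    Boxes are [x1, y1, x2, y2] in pixel coordinates."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the best-scoring
    box and discard remaining boxes that overlap it too much."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        overlaps = iou(boxes[i], boxes[rest])
        order = rest[overlaps < iou_thresh]
    return keep

# Two overlapping detections of one object plus one distant detection:
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # → [0, 2]
```

In a deployment like the one described, this filtering would run on the smartphone after the TensorFlow model produces its raw candidate boxes, so that each object is reported once.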