Estimation of number of unmanned aerial vehicles in a scene utilizing acoustic signatures and machine learning.

Author information

A N Wilson, Jha Ajit, Kumar Abhinav, Cenkeramaddi Linga Reddy

Affiliations

Department of ICT, University of Agder, Grimstad 4879, Norway.

Department of Engineering Science, University of Agder, Grimstad 4879, Norway.

Publication information

J Acoust Soc Am. 2023 Jul 1;154(1):533-546. doi: 10.1121/10.0020292.

DOI:10.1121/10.0020292

PMID:37497960

Abstract

With the exponential growth in unmanned aerial vehicle (UAV)-based applications, there is a need to ensure safe and secure operations. From a security perspective, detecting and localizing intruder UAVs is still a challenge. It is even more challenging to accurately estimate the number of intruder UAVs in the scene. In this work, we propose a simple acoustic-based technique to detect and estimate the number of UAVs. Our method utilizes the acoustic signals generated by the motion of UAV motors and propellers. Acoustic signals are captured by flying arbitrary subsets of ten UAVs, in different combinations, in an indoor setting. The recorded acoustic signals are trimmed, processed, and arranged to create a UAV audio dataset. The UAV audio dataset is subjected to time-frequency transformations to generate audio spectrogram images. The generated spectrogram images are then fed to a custom lightweight convolutional neural network (CNN) architecture to estimate the number of UAVs in the scene. After training, the proposed model achieves an average test accuracy of 93.33% and is compared against state-of-the-art benchmark models. Furthermore, the deployment feasibility of the proposed model is validated by measuring inference time on edge computing devices, such as the Raspberry Pi 4, NVIDIA Jetson Nano, and NVIDIA Jetson AGX Xavier.
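The time-frequency step described above can be illustrated with a minimal sketch. The abstract does not state which transform or parameters the authors used, so everything here is an assumption for illustration: a Hann-windowed short-time Fourier transform, a 16 kHz sample rate, a 1024-sample frame with a 512-sample hop, an 80 dB dynamic range, and a synthetic two-tone signal standing in for a recorded UAV clip. The function names `log_spectrogram` and `to_image` are hypothetical.

```python
import numpy as np


def log_spectrogram(signal, sample_rate, frame_len=1024, hop=512):
    """Hann-windowed STFT; returns (freqs, times, power spectrogram in dB)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames,
    # shape (n_frames, frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * window
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_spec = 10.0 * np.log10(power + 1e-12)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, log_spec.T  # spectrogram shape: (n_freqs, n_frames)


def to_image(log_spec, dynamic_range_db=80.0):
    """Clip to a fixed dynamic range and scale to an 8-bit grayscale image."""
    top = log_spec.max()
    floor = top - dynamic_range_db
    scaled = (np.clip(log_spec, floor, top) - floor) / dynamic_range_db
    return np.round(scaled * 255).astype(np.uint8)


# Synthetic stand-in for a one-second clip (assumed values, not measured data):
# a 200 Hz tone plus a weaker 400 Hz harmonic.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)

freqs, times, spec = log_spectrogram(clip, sample_rate)
image = to_image(spec)
print(image.shape, image.dtype)
```

An array like `image` is the kind of input a small CNN classifier could consume, with the number of UAVs as the class label. The network itself is not sketched here because the abstract gives no architectural details.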
