Zhao Beigeng, Song Rui
College of Public Security Information Technology and Intelligence, Criminal Investigation Police University of China, Shenyang, China.
Sci Rep. 2024 Feb 27;14(1):4765. doi: 10.1038/s41598-024-55570-z.
The high-altitude imaging capabilities of Unmanned Aerial Vehicles (UAVs) offer an effective solution for maritime Search and Rescue (SAR) operations. In such missions, accurately identifying boats, people, and objects in images is crucial. Object detection models trained on general image datasets can be applied directly to these tasks, but their effectiveness is limited by the specific characteristics of maritime SAR imagery. To address this, our study uses SeaDronesSee, a large-scale benchmark dataset for UAV-based maritime SAR, to analyze the attributes of image data in this scenario. We find that detection of certain difficult-to-detect object categories needs optimization. Building on this analysis, we propose an anchor box optimization strategy based on clustering analysis, aimed at improving the performance of widely used two-stage object detection models on this specialized task. Experiments were conducted to validate the proposed method and to explore the reasons for its effectiveness. The results show that our method increased average precision by 45.8% over torchvision's default anchor box configuration and by 10% over the configuration in the SeaDronesSee official sample code. The improvement was most evident in the model's ability to detect swimmers, floaters, and life jackets on boats in the SeaDronesSee dataset's SAR scenarios. The methods and findings of this study are expected to give the UAV-based maritime SAR research community insight into data characteristics and model optimization, and to serve as a reference for future research.
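The general idea of deriving anchor boxes by clustering can be sketched as follows. The abstract does not state which clustering algorithm or features the authors used, so this is only an illustration under assumptions: it runs plain one-dimensional k-means separately on the log scale and log aspect ratio (height divided by width) of ground-truth boxes. The function names `kmeans_1d` and `suggest_anchors` and the parameters `n_sizes` and `n_ratios` are hypothetical and do not come from the paper or from any library.

```python
import math


def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means with deterministic, quantile-based initialization."""
    data = sorted(values)
    n = len(data)
    # Start the k centers at evenly spaced quantiles of the sorted data.
    centers = [data[min(n - 1, int((i + 0.5) * n / k))] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def suggest_anchors(boxes_wh, n_sizes=3, n_ratios=3):
    """Suggest anchor sizes and aspect ratios from (width, height) pairs.

    Assumption: size is sqrt(w * h) and aspect ratio is h / w, and both
    are clustered in log space so small and large boxes weigh equally.
    """
    log_sizes = [0.5 * math.log(w * h) for w, h in boxes_wh]
    log_ratios = [math.log(h / w) for w, h in boxes_wh]
    sizes = [round(math.exp(c)) for c in kmeans_1d(log_sizes, n_sizes)]
    ratios = [round(math.exp(c), 2) for c in kmeans_1d(log_ratios, n_ratios)]
    return sizes, ratios
```

Values obtained this way could then be supplied as the `sizes` and `aspect_ratios` arguments of torchvision's `AnchorGenerator`, which also defines aspect ratio as height over width. Whether this matches the configuration evaluated in the study is not something the abstract confirms.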