Wu Chao, Li Shilong, Xie Tao, Wang Xiangdong, Zhou Jiali
School of Mathematical Sciences, Zhejiang University of Technology, Hangzhou 310023, China.
Sensors (Basel). 2024 Sep 11;24(18):5903. doi: 10.3390/s24185903.
With the rapid advancement of intelligent manufacturing technologies, the operating environments of modern robotic arms are becoming increasingly complex. Objects are diverse, and the foreground often closely resembles the background. Although traditional RGB-based object-detection models have achieved remarkable success in many fields, they still struggle to detect targets whose textures are similar to the background. To address this issue, we introduce the WoodenCube dataset, which contains over 5000 images of 10 different types of blocks. All images are densely annotated with object-level categories, bounding boxes, and rotation angles. We also propose a new evaluation metric, Cube-mAP, to assess detection performance on cube-like objects more accurately. In addition, we develop a simple yet effective framework for WoodenCube, termed CS-SKNet, which captures strong texture features in the scene by enlarging the network's receptive field. Experimental results indicate that CS-SKNet achieves the best performance on the WoodenCube dataset as measured by Cube-mAP. We further evaluate CS-SKNet on the challenging DOTAv1.0 dataset, where it yields consistent improvements, demonstrating strong generalization capability.
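To make the annotation scheme concrete (a bounding box plus a rotation angle per object), the following is a minimal sketch of how such an oriented box could be expanded into its four corner points. The parameterization `(cx, cy, w, h, angle_deg)` and the function name `obb_corners` are assumptions chosen for illustration; the abstract does not specify the dataset's actual storage format or angle convention.

```python
import math


def obb_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of an oriented box as (x, y) tuples.

    Assumed (hypothetical) convention: (cx, cy) is the box centre,
    w and h are the side lengths, and angle_deg rotates the box
    about its centre using the standard 2D rotation matrix.
    """
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    # Corner offsets from the centre before rotation.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset, then translate by the centre.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


if __name__ == "__main__":
    for corner in obb_corners(10.0, 20.0, 4.0, 2.0, 30.0):
        print(corner)
```

Corner lists of this kind are a common intermediate form for computing overlap between rotated boxes; how Cube-mAP itself scores predictions is not described in the abstract, so no metric is sketched here.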