Ye Yangqing, Ma Xiaolon, Zhou Xuanyi, Bao Guanjun, Wan Weiwei, Cai Shibo
College of Mechanical Engineering, Zhejiang University of Technology, Hangzhou 310023, China.
College of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China.
Sensors (Basel). 2023 Nov 28;23(23):9482. doi: 10.3390/s23239482.
Home service robots operating indoors, such as in houses and offices, must identify and locate target objects accurately and in real time to perform service tasks efficiently. However, images captured by visual sensors while the robot is moving usually contain varying degrees of blur, which poses a significant challenge for object detection. In particular, daily-life scenes contain small objects such as fruit and tableware that are often occluded, further complicating object recognition and positioning. This paper proposes a dynamic, real-time object detection algorithm for home service robots, composed of an image deblurring algorithm and an object detection algorithm. To improve the clarity of motion-blurred images, the DA-Multi-DCGAN algorithm is proposed. It comprises an embedded dynamic adjustment mechanism and a multimodal multiscale fusion structure based on robot motion and surrounding environmental information, enabling it to deblur images captured under different motion states. Compared with DeblurGAN, DA-Multi-DCGAN improved the Peak Signal-to-Noise Ratio (PSNR) by 5.07 and the Structural Similarity (SSIM) by 0.022. For small and occluded object detection, the AT-LI-YOLO method is proposed. Based on depthwise separable convolution, this method embeds an attention module in the AT-Resblock to highlight key areas and integrate salient features, improving the sensitivity and detection precision for small and partially occluded objects. It also employs a lightweight network unit, Lightblock, to reduce the network's parameter count and computational complexity, which improves computational efficiency. Compared with YOLOv3, AT-LI-YOLO increased the mean average precision (mAP) by 3.19%, and increased the detection precision for small objects (such as apples and oranges) and for partially occluded objects by 19.12% and 29.52%, respectively. Moreover, model inference time was reduced by 7 ms. The Grasp-17 dataset, based on the typical home activities of older people and children, was established for training and testing the proposed method. Running on the TensorRT neural network inference engine of the developed service robot prototype, the proposed dynamic and real-time object detection algorithm required 29 ms, which meets the real-time requirement for smooth vision.
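The PSNR figure quoted for the deblurring stage can be made concrete with a short sketch. The function below follows the standard definition of PSNR (in dB) for 8-bit grayscale images; it is not the authors' evaluation code, and the function name `psnr`, the nested-list image representation, and the sample pixel values are assumptions made purely for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Standard PSNR in dB between two equally sized grayscale images.

    Images are given as nested lists of pixel intensities (an illustrative
    representation, not the one used in the paper).
    """
    if len(reference) != len(restored) or any(
        len(row_ref) != len(row_res)
        for row_ref, row_res in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    count = 0
    for row_ref, row_res in zip(reference, restored):
        for p, q in zip(row_ref, row_res):
            squared_error += (p - q) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 example: a sharp reference and a slightly degraded copy.
sharp = [[100, 100], [100, 100]]
degraded = [[110, 90], [100, 100]]
print(f"PSNR: {psnr(sharp, degraded):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a higher value means the restored image is closer to the sharp reference, which is the sense in which an improvement over DeblurGAN is reported.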