National Engineering Research Center of Optical Instrumentation, Zhejiang University, Hangzhou 310058, China.
Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany.
Sensors (Basel). 2021 May 20;21(10):3558. doi: 10.3390/s21103558.
Scene sonification is a powerful technique to help Visually Impaired People (VIP) understand their surroundings. Existing methods usually sonify either the entire image of the surrounding scene acquired by a standard camera or the static obstacles detected a priori by image processing algorithms applied to the RGB image of the scene. However, delivering all the information in the scene to VIP simultaneously causes information redundancy. In fact, biological vision is more sensitive to moving objects than to static ones, which is also the motivation behind the event-based camera. In this paper, we propose a real-time sonification framework to help VIP understand the moving objects in the scene. First, we capture the events in the scene using an event-based camera and cluster them into multiple moving objects without relying on any prior knowledge. Then, MIDI-based sonification is applied to these objects synchronously. Finally, we conduct comprehensive experiments on the scene video with sonification audio, attended by 20 VIP and 20 Sighted People (SP). The results show that our method allows both groups of participants to clearly distinguish the number, size, motion speed, and motion trajectories of multiple objects, and that it is more comfortable to hear than existing methods in terms of aesthetics.
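The pipeline outlined in the abstract (events, then clusters, then MIDI parameters) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm or the sound mapping, so the grid-based connected-component clustering, the 240x180 sensor resolution, the `Note` type, and the mapping (vertical position to pitch, horizontal position to pan, event count to velocity) are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int     # MIDI note number (0-127)
    velocity: int  # MIDI velocity (1-127)
    pan: int       # MIDI pan value, 0 (left) to 127 (right)


def cluster_events(events, cell=8):
    """Group (x, y) events into clusters of 8-connected occupied grid cells.

    Stand-in clustering (an assumption, not the paper's method): it needs
    no prior knowledge of the number or shape of the objects.
    """
    cells = {}
    for x, y in events:
        cells.setdefault((x // cell, y // cell), []).append((x, y))

    seen, clusters = set(), []
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:  # breadth-first search over neighbouring occupied cells
            cx, cy = queue.popleft()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        clusters.append(members)
    return clusters


def cluster_to_note(points, width=240, height=180):
    """Map one cluster to MIDI parameters (hypothetical mapping)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n  # centroid, horizontal
    cy = sum(y for _, y in points) / n  # centroid, vertical
    pitch = 48 + round((1 - cy / (height - 1)) * 36)  # higher in image, higher pitch
    pan = round(cx / (width - 1) * 127)               # left/right position
    velocity = min(127, 40 + n)                       # more events, louder note
    return Note(pitch, velocity, pan)


# Two small synthetic groups of events, far apart in the image.
events = [(10, 10), (11, 10), (12, 11), (200, 150), (201, 151)]
notes = [cluster_to_note(c) for c in cluster_events(events)]
```

Each resulting `Note` could then be sent as a note-on message plus a pan control change through any MIDI library. Motion speed and trajectory, which the study also evaluates, would require tracking clusters across successive time windows and are left out of this sketch.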