Jian Muwei, Yu Hui
School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China.
School of Creative Technologies, University of Portsmouth, Portsmouth 200021, UK.
Fundam Res. 2023 Aug 10;5(1):354-359. doi: 10.1016/j.fmre.2023.08.001. eCollection 2025 Jan.
During image understanding, the human visual system (HVS) performs multiscale analysis of the objects in a scene. Rather than scanning image pixels point by point, the HVS primarily attends to marginally conspicuous image patches located within or around distinct objects. Inspired by this mechanism, we describe and exploit multiscale decomposition-based patch detection models for automatic visual feature representation and object localization in images. By mimicking and modeling how the HVS captures conspicuous sparse patches and their spatial distribution cues, this work contributes to the automatic comprehension and characterization of images by machines. The study demonstrates that a sparse patch-based visual representation with spatial center cues is intrinsically tolerant to variations in an object's spatial position, resolution, and chrominance when localizing and understanding objects. This has significant implications for vision-based automatic object grasping and perception applications such as robotics, human-machine interaction, and unmanned aerial vehicles (UAVs).