Jardim Sandra, António João, Mora Carlos
Smart Cities Research Center, Polytechnic Institute of Tomar, 2300-313 Tomar, Portugal.
Techframe-Information Systems, SA, 2785-338 São Domingos de Rana, Portugal.
J Imaging. 2022 Jun 8;8(6):163. doi: 10.3390/jimaging8060163.
Image segmentation has a wide range of applications and is a complex, difficult preprocessing step that plays an important role in automatic visual systems: its accuracy not only determines the segmentation results but also directly affects the effectiveness of follow-up tasks. Despite the many advances of recent decades, image segmentation remains a challenging problem, particularly for color images, because of the diverse inhomogeneities of color, texture and shape present in their descriptive features. In trademark graphic image segmentation, beyond these difficulties, we must also account for the high noise and low resolution that are often present. Trademark graphic images can also be very heterogeneous in the elements that compose them, which may overlap and appear under varying lighting conditions. Because of the immense variation found in corporate logos and trademark graphic images, it is often difficult to select a single method that extracts relevant image regions with satisfactory results. Many of the hybrid approaches that integrate the Watershed and K-Means algorithms process very high-quality, visually similar images, such as medical images, which means that either approach can be tuned to work on images that follow a certain pattern. Trademark images, by contrast, differ greatly from one another and are usually fully colored. Our system addresses this difficulty because it is a generalized implementation, designed to work in most scenarios through customizable parameters and without bias toward any image type. In this paper, we propose a hybrid approach to Image Region Extraction that focuses on automated region proposal and segmentation techniques. In particular, we analyze popular techniques such as K-Means Clustering and Watershedding and their effectiveness when deployed in a hybrid environment and applied to a highly variable dataset.
The proposed system consists of a multi-stage algorithm that takes an RGB image as input and produces multiple outputs, corresponding to the extracted regions. After the preprocessing steps, a K-Means function with random initial centroids and a user-defined value for k is executed over the RGB image, generating a gray-scale segmented image. A threshold method is then applied to this image to generate a binary mask, which contains the information needed to generate a distance map. Next, the Watershed function is performed over the distance map, using the markers defined by the Connected Component Analysis function, which labels regions based on 8-way pixel connectivity, ensuring that all regions are correctly found. Finally, individual objects are labelled for extraction through a contour method based on border following. The results show adequate region extraction capabilities when processing graphical images from different datasets: the system correctly distinguishes the most relevant visual elements of images with minimal tweaking.
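
The stages described above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. All function names, the mid-range threshold, the `marker_ratio` parameter, the chessboard distance metric, the assumption that objects are darker than the background, and the toy image are illustrative choices that do not come from the paper. Preprocessing and the final border-following contour step are omitted, and the sketch stops at the labelled regions.

```python
import heapq
import random
from collections import deque

def neighbours(r, c, h, w):
    """Yield the in-bounds 8-connected neighbours of pixel (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < h and 0 <= c + dc < w:
                yield r + dr, c + dc

def nearest(p, centroids):
    """Index of the centroid closest to colour p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))

def kmeans_gray(img, k, seed=0, iters=10):
    """K-Means over RGB pixels; each pixel becomes the gray level of its centroid."""
    pixels = [p for row in img for p in row]
    # Random initial centroids, drawn from the distinct colours (needs >= k of them).
    centroids = random.Random(seed).sample(sorted(set(pixels)), k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pixels:
            groups[nearest(p, centroids)].append(p)
        centroids = [tuple(sum(ch) / len(g) for ch in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return [[sum(centroids[nearest(p, centroids)]) / 3 for p in row] for row in img]

def distance_map(mask):
    """Chessboard distance to the nearest background pixel (needs some background)."""
    h, w = len(mask), len(mask[0])
    dist = [[None if m else 0 for m in row] for row in mask]
    queue = deque((r, c) for r in range(h) for c in range(w) if not mask[r][c])
    while queue:  # multi-source breadth-first search from the background
        r, c = queue.popleft()
        for nr, nc in neighbours(r, c, h, w):
            if dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

def label_components(grid):
    """Label True pixels by 8-way connectivity; 0 marks unlabelled pixels."""
    h, w = len(grid), len(grid[0])
    labels, count = [[0] * w for _ in range(h)], 0
    for r in range(h):
        for c in range(w):
            if grid[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    cr, cc = queue.popleft()
                    for nr, nc in neighbours(cr, cc, h, w):
                        if grid[nr][nc] and not labels[nr][nc]:
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels

def watershed(dist, mask, markers):
    """Grow the markers over the mask, visiting the largest distances first."""
    h, w = len(mask), len(mask[0])
    labels = [row[:] for row in markers]
    heap = [(-dist[r][c], r, c) for r in range(h) for c in range(w) if labels[r][c]]
    heapq.heapify(heap)
    while heap:
        _, r, c = heapq.heappop(heap)
        for nr, nc in neighbours(r, c, h, w):
            if mask[nr][nc] and not labels[nr][nc]:
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (-dist[nr][nc], nr, nc))
    return labels

def extract_regions(img, k=2, marker_ratio=0.6):
    """Run the sketched pipeline and return a label image (0 = background)."""
    gray = kmeans_gray(img, k)
    flat = [v for row in gray for v in row]
    cut = (min(flat) + max(flat)) / 2  # mid-range threshold (assumption)
    mask = [[v < cut for v in row] for row in gray]  # assumes dark objects
    dist = distance_map(mask)
    peak = max(d for row in dist for d in row)
    seeds = [[d > 0 and d >= marker_ratio * peak for d in row] for row in dist]
    return watershed(dist, mask, label_components(seeds))

# Toy input: two 5x5 squares joined by a one-pixel bridge on a light background.
BG, SHADE_A, SHADE_B = (250, 250, 250), (200, 30, 30), (180, 20, 40)
image = [[BG] * 17 for _ in range(9)]
for r in range(2, 7):
    for c in range(2, 7):
        image[r][c], image[r][c + 8] = SHADE_A, SHADE_B
for c in range(7, 10):
    image[4][c] = SHADE_A
regions = extract_regions(image)
```

On this toy input the binary mask is a single connected blob, yet the distance map has one peak per square. The connected component step therefore yields two markers, and the watershed splits the blob somewhere along the bridge. A practical implementation would more likely rely on an image-processing library and add the contour step to cut out each labelled region.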