Department of Computer Science and Technology, Xinzhou Teachers University, Xinzhou, China.
School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, China.
PLoS One. 2020 Jun 1;15(6):e0234014. doi: 10.1371/journal.pone.0234014. eCollection 2020.
With the rapid development of the Internet and the growing popularity of mobile devices, the volume of available digital images is increasing exponentially. Retrieving and organizing image information quickly and effectively has become a pressing problem. In image retrieval, automatic image annotation remains a fundamental and challenging task. To address the low accuracy and high memory consumption of current multilabel annotation methods, this study proposes a CM-supplement network model. The model combines dilated convolutions (called cavity convolutions in this study), Inception modules and a supplement network. Replacing ordinary convolutions with cavity convolutions enlarges the receptive field without increasing the number of parameters. The Inception modules allow the model to extract image features at different scales with less memory consumption. The supplement network allows the model to capture the negative features of images. After 100 training iterations on the PASCAL VOC 2012 dataset, the proposed model achieved an overall annotation accuracy of 94.5%, which is 10.0 and 1.1 percentage points higher than that of a traditional convolutional neural network (CNN) and a double-channel CNN (DCCNN), respectively. After stabilization, the model reached an accuracy of up to 96.4%. Moreover, the DCCNN had more than 1.5 times as many parameters as the CM-supplement network. Without increasing memory consumption, the proposed CM-supplement network achieves annotation results comparable to or better than those of a DCCNN.
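The claim that cavity (dilated) convolutions enlarge the receptive field without adding parameters can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names, the 3x3 kernel, the 64-channel width and the dilation rates are all assumptions chosen for the example, and stride 1 is assumed throughout.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated kernel along one axis.

    A dilation of d places (d - 1) gaps between neighbouring kernel taps,
    so a k-tap kernel spans k + (k - 1) * (d - 1) input positions.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_param_count(kernel_size: int, in_channels: int,
                     out_channels: int, bias: bool = True) -> int:
    """Learnable parameters of a 2D convolution layer.

    The count depends only on the number of kernel taps and channels,
    not on the dilation rate.
    """
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def stacked_receptive_field(kernel_size: int, dilations: list[int]) -> int:
    """Receptive field (one axis) of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (k - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    k, c_in, c_out = 3, 64, 64
    for d in (1, 2, 4):
        print(f"dilation={d}: spans {effective_kernel_size(k, d)} pixels, "
              f"{conv_param_count(k, c_in, c_out)} parameters")
    print("3 ordinary layers:", stacked_receptive_field(k, [1, 1, 1]))
    print("3 dilated layers: ", stacked_receptive_field(k, [1, 2, 4]))
```

Under these assumptions, every layer has the same parameter count regardless of dilation, while three stacked 3x3 layers with dilations 1, 2 and 4 cover a 15-pixel-wide field instead of 7.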