Yang Yuxuan, Lei Zhichun, Li Changlu
School of Microelectronics, Tianjin University, Tianjin 300072, China.
Institute of Sensors and Measurements, University of Applied Sciences Ruhr West, 45479 Mülheim an der Ruhr, Germany.
Sensors (Basel). 2024 Aug 12;24(16):5221. doi: 10.3390/s24165221.
No-reference image quality assessment aims to evaluate image quality in line with human subjective perception. Current methods face two challenges: an insufficient ability to attend to global and local information simultaneously, and information loss caused by image resizing. To address these issues, we propose a model that combines the Swin Transformer with natural scene statistics. The model uses the Swin Transformer to extract multi-scale features and incorporates a feature enhancement module and deformable convolution to improve feature representation and better adapt to structural variations in images; it applies dual-branch attention to focus on key regions, aligning the assessment more closely with human visual perception. Natural scene statistics compensate for the information loss caused by image resizing. Additionally, we use a normalized loss function to accelerate model convergence and enhance training stability. We evaluate the model on six standard image quality assessment datasets (both synthetic and authentic) and show that it achieves state-of-the-art results on multiple datasets. Compared with the state-of-the-art DACNN method, our model achieves Spearman rank correlation coefficients of 0.922 and 0.923 on the KADID and KonIQ datasets, improvements of 1.9% and 2.4%, respectively, demonstrating strong performance on both synthetic and authentic scenes.
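The abstract describes the architecture only at a high level, so the following PyTorch sketch is one plausible reading rather than the authors' released code. The backbone interface, the channel counts, the 36-dimensional NSS vector (a BRISQUE-style size), the batch-normalized loss, and all module names here are assumptions for illustration; only `torchvision.ops.DeformConv2d` is a real library call.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DualBranchAttention(nn.Module):
    """Hypothetical dual-branch attention: a channel-gating branch and a
    spatial-map branch whose outputs jointly reweight the feature map."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel branch: squeeze-and-excitation-style gating.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial branch: one attention weight per location.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.channel(x) * self.spatial(x)


class FeatureEnhancement(nn.Module):
    """Enhancement block built on deformable convolution; the offsets are
    predicted by a plain conv, as torchvision's DeformConv2d expects."""

    def __init__(self, channels: int):
        super().__init__()
        # 3x3 kernel -> 2 * 3 * 3 = 18 offset channels.
        self.offset = nn.Conv2d(channels, 18, kernel_size=3, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size=3, padding=1)
        self.attn = DualBranchAttention(channels)

    def forward(self, x):
        return self.attn(self.deform(x, self.offset(x)))


class SwinNSSIQA(nn.Module):
    """End-to-end sketch: multi-scale backbone features are enhanced, pooled,
    concatenated with a precomputed NSS feature vector, and regressed to a
    single quality score."""

    def __init__(self, backbone: nn.Module, stage_channels, nss_dim: int = 36):
        super().__init__()
        self.backbone = backbone  # assumed to return a list of feature maps
        self.enhance = nn.ModuleList(FeatureEnhancement(c) for c in stage_channels)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(
            nn.Linear(sum(stage_channels) + nss_dim, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 1),
        )

    def forward(self, image, nss_feats):
        feats = [self.pool(enh(f)).flatten(1)
                 for enh, f in zip(self.enhance, self.backbone(image))]
        return self.head(torch.cat(feats + [nss_feats], dim=1)).squeeze(1)


def normalized_mse_loss(pred, mos, eps=1e-8):
    """Hedged guess at the 'normalized loss': standardize predictions and
    subjective scores within the batch before computing MSE, which keeps
    gradient scale independent of the dataset's MOS range."""
    pred = (pred - pred.mean()) / (pred.std() + eps)
    mos = (mos - mos.mean()) / (mos.std() + eps)
    return torch.mean((pred - mos) ** 2)
```

Any multi-scale feature extractor fits the `backbone` slot (e.g., a Swin Transformer configured to return per-stage feature maps), and the NSS branch would supply `nss_feats` computed from the full-resolution image, which is how the sketch reflects the paper's stated motivation of compensating for resizing loss.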