Eybposh M Hossein, Cai Changjia, Moossavi Aram, Rodriguez-Romaguera Jose, Pégard Nicolas C
Department of Applied Physical Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
Joint Department of Biomedical Engineering, The University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
Sci Rep. 2024 Aug 29;14(1):20066. doi: 10.1038/s41598-024-70469-5.
Effectively assessing the realism and naturalness of images in virtual (VR) and augmented (AR) reality applications requires Full Reference Image Quality Assessment (FR-IQA) metrics that closely align with human perception. Deep learning-based IQA models trained on human-labeled data have recently shown promise in generic computer vision tasks. However, their performance degrades in applications where a perfect match between the reference and the distorted image should not be expected, or where distortion patterns are restricted to a specific domain. Tackling this issue requires training a task-specific neural network, yet collecting human-labeled FR-IQA data is costly, and deep learning typically demands substantial labeled data. To address these challenges, we developed ConIQA, a deep learning-based IQA that leverages consistency training and a novel data augmentation method to learn from both labeled and unlabeled data. This makes ConIQA well-suited to contexts where labeled data is scarce. To validate ConIQA, we considered the example application of Computer-Generated Holography (CGH), where specific artifacts such as ringing, speckle, and quantization errors routinely occur yet are not explicitly accounted for by existing IQAs. We developed a new dataset, HQA1k, comprising 1000 natural images, each paired with an image rendered by one of several popular CGH algorithms and quality-rated by thirteen human participants. Our results show that ConIQA outperforms fifteen FR-IQA metrics by up to 5%, achieving Pearson (0.98), Spearman (0.965), and Kendall's tau (0.86) correlations and markedly closer alignment with human perception on the HQA1k dataset.
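The core idea described in the abstract is semi-supervised consistency training: a supervised regression term on human-rated reference/distorted pairs combined with a consistency term that encourages stable predictions on unlabeled pairs under data augmentation. The sketch below illustrates how such an objective might be assembled in PyTorch; the function name, model signature, and loss weighting are illustrative assumptions, not the authors' published ConIQA implementation.

```python
import torch
import torch.nn.functional as F

def semi_supervised_iqa_loss(model,
                             ref_lab, dist_lab, mos,        # labeled: reference, distorted, human score
                             ref_unlab, dist_unlab,         # unlabeled reference/distorted pair
                             augment,                        # quality-preserving augmentation function
                             lambda_cons=1.0):
    """Illustrative semi-supervised IQA objective (not the published ConIQA code):
    supervised regression on labeled pairs + consistency on unlabeled pairs."""
    # Supervised term: the predicted quality score should match the human opinion score.
    pred_lab = model(ref_lab, dist_lab)
    sup_loss = F.mse_loss(pred_lab, mos)

    # Consistency term: predictions on an unlabeled pair should be stable
    # under an augmentation that does not change perceived quality.
    with torch.no_grad():
        target = model(ref_unlab, dist_unlab)   # prediction on the unaugmented pair
    pred_aug = model(augment(ref_unlab), augment(dist_unlab))
    cons_loss = F.mse_loss(pred_aug, target)

    return sup_loss + lambda_cons * cons_loss
```

In this formulation the unlabeled pairs require no human ratings, which is what makes the approach attractive when labeled FR-IQA data is scarce, as the abstract argues.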