Community-Aware Photo Quality Evaluation by Deeply Encoding Human Perception.

Publication information

IEEE Trans Cybern. 2022 May;52(5):3136-3146. doi: 10.1109/TCYB.2019.2937319. Epub 2022 May 19.

Abstract

Computational photo quality evaluation is a useful technique in many tasks of computer vision and graphics, for example, photo retargeting, 3-D rendering, and fashion recommendation. Conventional photo quality models are designed by characterizing the pictures from all communities (e.g., "architecture" and "colorful") indiscriminately, wherein community-specific features are not exploited explicitly. In this article, we develop a new community-aware photo quality evaluation framework. It uncovers the latent community-specific topics by a regularized latent topic model (LTM) and captures human visual quality perception by exploring multiple attributes. More specifically, given massive-scale online photographs from multiple communities, a novel ranking algorithm is proposed to measure the visual/semantic attractiveness of regions inside each photograph. Meanwhile, three attributes, namely: 1) photo quality scores; 2) weak semantic tags; and 3) inter-region correlations, are seamlessly and collaboratively incorporated during ranking. Subsequently, we construct the gaze shifting path (GSP) for each photograph by sequentially linking its top-ranking regions, and an aggregation-based CNN calculates the deep representation for each GSP. Based on this, an LTM is proposed to model the GSP distribution from multiple communities in the latent space. To mitigate the overfitting problem caused by communities with very few photographs, a regularizer is incorporated into our LTM. Finally, given a test photograph, we obtain its deep GSP representation, and its quality score is determined by the posterior probability of the regularized LTM. Comparative studies on four image sets have shown the competitiveness of our method. In addition, eye-tracking experiments have demonstrated that our ranking-based GSPs are highly consistent with real human gaze movements.
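The GSP construction step, sequentially linking the top-ranking regions of a photograph, can be illustrated with a minimal sketch. It assumes that each region already has a scalar attractiveness score; the article's ranking algorithm, which combines the three attributes, is not reproduced here. The function name `gaze_shifting_path`, the parameter `k`, and the example scores are hypothetical and do not come from the article.

```python
from typing import List, Sequence, Tuple


def gaze_shifting_path(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring regions, best first.

    `scores` are assumed, precomputed attractiveness scores (hypothetical
    input); ties keep their original region order because sorted() is stable.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def path_segments(path: Sequence[int]) -> List[Tuple[int, int]]:
    """Link consecutive regions on the path into (from, to) gaze shifts."""
    return list(zip(path, path[1:]))


# Hypothetical scores for four regions of one photograph.
region_scores = [0.2, 0.9, 0.5, 0.7]
path = gaze_shifting_path(region_scores, k=3)
segments = path_segments(path)
```

Here `path` lists region indices in the order a viewer is assumed to attend to them, and `segments` lists the individual shifts between consecutive regions. In the framework described above, such a path would then be passed to the aggregation-based CNN to obtain its deep representation.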

