IEEE Trans Pattern Anal Mach Intell. 2020 Sep;42(9):2165-2178. doi: 10.1109/TPAMI.2019.2914392. Epub 2019 May 2.
Memorability of an image is a characteristic determined by the human observers' ability to remember images they have seen. Yet recent work on image memorability defines it as an intrinsic property that can be obtained independent of the observer. The current study aims to enhance our understanding and prediction of image memorability, improving upon existing approaches by incorporating the properties of cumulative human annotations. We propose a new concept called the Visual Memory Schema (VMS) referring to an organization of image components human observers share when encoding and recognizing images. The concept of VMS is operationalised by asking human observers to define memorable regions of images they were asked to remember during an episodic memory test. We then statistically assess the consistency of VMSs across observers for either correctly or incorrectly recognised images. The associations of the VMSs with eye fixations and saliency are analysed separately as well. Lastly, we adapt various deep learning architectures for the reconstruction and prediction of memorable regions in images and analyse the results when using transfer learning at the outputs of different convolutional network layers.