Kumar Ravi Kant, Garain Jogendra, Kisku Dakshina Ranjan, Sanyal Goutam
Department of Computer Science and Engineering, National Institute of Technology Durgapur, Durgapur, West Bengal 713209 India.
Cogn Neurodyn. 2019 Apr;13(2):125-149. doi: 10.1007/s11571-018-9515-z. Epub 2019 Jan 2.
When viewing a scene containing multiple faces, or looking at a group photograph, our attention is not distributed equally across all the faces; we are naturally biased towards some of them. This bias arises from dominant perceptual features in those faces. In visual saliency terminology, such a face can be called a 'salient face'. Humans focus their gaze on the face that carries the 'dominating look' in the crowd, and this is a consequence of the comparative saliency of the faces: the saliency of a face is determined by its feature dissimilarity with the surrounding faces. Human psychology and cognitive science also play a large role in this context. Consequently, extensive research has been carried out on modeling computer vision systems after human vision. This paper proposes a graph-based, bottom-up approach for identifying the salient face in a crowd or in an image containing multiple faces. In this novel method, the visual saliency of each face is calculated from intensity values, facial areas, and relative spatial distances. Experiments were conducted on grayscale images. The results were validated at three levels. At the first level, the results were verified against a prepared ground truth. At the second level, the intensity scores of the proposed saliency maps were cross-verified with the saliency scores. At the third level, the saliency maps were validated with standard parameters. The results are interesting, and in some respects the saliency predictions resemble those of the human visual system. The evaluation of the proposed approach shows moderately improved results, and hence this idea may be useful in the future modeling of intelligent vision (robot vision) systems.
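To make the idea of comparative saliency concrete, the following is a minimal illustrative sketch, not the formulation used in the paper. It assumes each detected face is summarized by its mean gray level, its area, and its center, treats the faces as nodes of a fully connected graph, and scores each face by its summed dissimilarity to the others, down-weighted by distance. The `Face` type, the `face_saliency` function, the equal weighting of intensity and area, and the `sigma` distance scale are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Face:
    mean_intensity: float  # mean gray level of the face region, in [0, 255]
    area: float            # area of the face region, in pixels
    cx: float              # x coordinate of the face center
    cy: float              # y coordinate of the face center


def face_saliency(faces, sigma=200.0):
    """Return one saliency score per face, normalized to sum to 1.

    Assumed edge weight (illustrative only, not taken from the paper):
    (intensity difference + area difference) * exp(-distance / sigma).
    """
    if len(faces) < 2:
        # With zero or one face there is nothing to compare against.
        return [1.0] * len(faces)

    max_area = max(f.area for f in faces)
    scores = []
    for i, a in enumerate(faces):
        total = 0.0
        for j, b in enumerate(faces):
            if i == j:
                continue
            d_int = abs(a.mean_intensity - b.mean_intensity) / 255.0
            d_area = abs(a.area - b.area) / max_area
            dist = math.hypot(a.cx - b.cx, a.cy - b.cy)
            # Nearby, dissimilar faces contribute the most.
            total += (d_int + d_area) * math.exp(-dist / sigma)
        scores.append(total)

    norm = sum(scores)
    if norm == 0:
        # All faces are identical: no face stands out.
        return [1.0 / len(faces)] * len(faces)
    return [s / norm for s in scores]


faces = [
    Face(mean_intensity=100, area=1000, cx=0, cy=0),
    Face(mean_intensity=100, area=1000, cx=100, cy=0),
    Face(mean_intensity=200, area=2000, cx=50, cy=50),
]
scores = face_saliency(faces)
print(scores.index(max(scores)))
```

In this toy input the third face is both brighter and larger than the two identical faces around it, so it receives the highest score. The paper's actual graph construction and weighting may differ substantially from this sketch.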