IEEE Trans Image Process. 2015 Oct;24(10):2996-3008. doi: 10.1109/TIP.2015.2431437. Epub 2015 May 8.
In this paper, we study a novel problem of classifying covert photos, whose acquisition processes are intentionally concealed from the subjects being photographed. Covert photos are often privacy invasive and, if distributed over the Internet, can cause serious consequences. Automatic identification of such photos therefore serves as an important initial step toward further privacy protection operations. The problem is, however, very challenging due to the large semantic similarity between covert and noncovert photos, the enormous diversity in the photographing process and environment of covert photos, and the difficulty of collecting an effective data set for the study. To address these challenges, we make three contributions. First, we collect a large data set containing 2500 covert photos, each of which has been rigorously and carefully verified. Second, we conduct a user study on how humans distinguish covert photos from noncovert ones. The user study not only provides an important evaluation baseline, but also suggests fusing heterogeneous information for an automatic solution. Our third contribution is a covert photo classification algorithm that fuses various image features and visual attributes in the multiple kernel learning framework. We evaluate the proposed approach on the collected data set in comparison with other modern image classifiers. The results show that our approach achieves an average classification rate (1-EER) of 0.8940, which significantly outperforms the competing classifiers as well as human performance.
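The classification rate above is reported as 1-EER, where EER is the equal error rate: the operating point at which the false positive rate equals the false negative rate. The abstract does not describe how the EER was computed, so the following is only a minimal sketch of one common way to estimate it from classifier scores. The function name `one_minus_eer`, the threshold sweep over observed scores, and the toy inputs are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def one_minus_eer(scores, labels):
    """Estimate 1-EER from real-valued scores and binary labels (1 = covert).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)

    # Sweep every observed score as a threshold, plus +inf so that the
    # "reject everything" operating point is also considered.
    thresholds = np.concatenate([np.unique(scores), [np.inf]])

    # A sample is predicted positive when its score is >= the threshold.
    fpr = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    fnr = np.array([(scores[labels] < t).mean() for t in thresholds])

    # With finite data the two rates rarely match exactly, so take the
    # threshold where they are closest and average them.
    i = np.argmin(np.abs(fpr - fnr))
    eer = (fpr[i] + fnr[i]) / 2.0
    return 1.0 - eer

# Toy usage with made-up scores: perfectly separable classes.
rate = one_minus_eer([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
```

Averaging the two rates at the closest threshold is a simple approximation; interpolating along the ROC curve is another common choice and can give slightly different values on small data sets.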