Moreira Daniel, Bharati Aparna, Brogan Joel, Pinto Allan, Parowski Michael, Bowyer Kevin W, Flynn Patrick J, Rocha Anderson, Scheirer Walter J
IEEE Trans Image Process. 2018 Aug 16. doi: 10.1109/TIP.2018.2865674.
Prior art has shown that image processing and computer vision techniques can estimate the types and parameters of the transformations applied to the content of individual images to obtain new images. Given a large corpus of images and a query image, an interesting further step is to retrieve the set of original images whose content is present in the query image, as well as the detailed sequences of transformations that yield the query image from those original images. This problem has recently been named image provenance analysis. In these times of public media manipulation (e.g., fake news and meme sharing), obtaining the history of image transformations is relevant for fact checking and authorship verification, among many other applications. This article presents an end-to-end processing pipeline for image provenance analysis that works at real-world scale. It employs a cutting-edge image filtering solution that is custom-tailored to the problem, as well as novel techniques for obtaining the provenance graph, which expresses how the images, as nodes, are ancestrally connected. A comprehensive set of experiments is provided for each stage of the pipeline, comparing the proposed solution with state-of-the-art results on previously published datasets. In addition, this work introduces a new dataset of real-world provenance cases from the social media site Reddit, along with baseline results.