Detection of AI-Created Images Using Pixel-Wise Feature Extraction and Convolutional Neural Networks.

Authors

Martin-Rodriguez Fernando, Garcia-Mojon Rocio, Fernandez-Barciela Monica

Affiliation

AtlanTTic Research Center for Telecommunication Technologies, University of Vigo, 36310 Vigo, Spain.

Publication information

Sensors (Basel). 2023 Nov 8;23(22):9037. doi: 10.3390/s23229037.

DOI:10.3390/s23229037

PMID:38005425

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10674908/

Abstract

Generative AI has attracted enormous interest thanks to new applications such as ChatGPT, DALL-E, Stable Diffusion, and Deepfake. In particular, DALL-E, Stable Diffusion, and similar tools (Adobe Firefly, ImagineArt, etc.) can create images from a text prompt, including photorealistic ones. As a result, intense research has gone into new image forensics applications that distinguish real captured images and videos from artificial ones. Detecting forgeries made with Deepfake is one of the most researched problems. This paper addresses a different kind of forgery detection: telling photorealistic AI-created images apart from real photos taken with a physical camera. That is, it makes a binary decision about whether an image was artificially or naturally created. The artificial images need not depict any real object, person, or place. To this end, two pixel-level feature extraction techniques are used. The first is photo response non-uniformity (PRNU), a characteristic noise caused by imperfections in the camera sensor that is used for source camera identification. The underlying idea is that AI images will have a different PRNU pattern. The second is error level analysis (ELA), a feature extraction technique traditionally used to detect image editing; photographers now also use ELA to detect AI-created images manually. Both kinds of features are used to train convolutional neural networks to differentiate AI images from real photographs. Good results are obtained, with accuracy rates above 95%. Both extraction methods are carefully assessed by computing precision, recall, and F-score.
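
The two pixel-level feature maps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `ela_map` and `noise_residual`, the JPEG quality of 90, and the box-filter denoiser are all assumptions made for illustration. PRNU extraction in the forensics literature typically relies on a stronger denoiser (often wavelet-based); the box filter here is only a simple stand-in showing the "image minus denoised image" idea. The sketch assumes NumPy and Pillow are installed.

```python
import io

import numpy as np
from PIL import Image, ImageChops


def ela_map(image: Image.Image, quality: int = 90) -> np.ndarray:
    """Error level analysis: per-pixel difference between an image and a
    JPEG re-save of it. The quality value is an assumed setting."""
    rgb = image.convert("RGB")
    buffer = io.BytesIO()
    rgb.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer).convert("RGB")
    diff = ImageChops.difference(rgb, resaved)  # absolute difference per channel
    return np.asarray(diff, dtype=np.float32) / 255.0


def noise_residual(image: Image.Image, radius: int = 1) -> np.ndarray:
    """Crude PRNU-style residual: grayscale image minus a box-blurred copy.
    The box blur is a stand-in for the denoiser used in real PRNU pipelines."""
    gray = np.asarray(image.convert("L"), dtype=np.float32) / 255.0
    k = 2 * radius + 1
    padded = np.pad(gray, radius, mode="reflect")
    # Each output pixel is the mean of its k-by-k neighbourhood.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    denoised = windows.mean(axis=(-2, -1))
    return gray - denoised


if __name__ == "__main__":
    # Synthetic random image, used only so the sketch runs without a file.
    rng = np.random.default_rng(0)
    demo = Image.fromarray(rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8))
    print(ela_map(demo).shape, noise_residual(demo).shape)
```

Either array could then serve as the input image for a convolutional classifier, which is the role the abstract assigns to these features.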
