Experts fail to reliably detect AI-generated histological data.

Affiliations

Institute for Physiology, Faculty of Medicine, University of Freiburg, 79108, Freiburg, Germany.

BrainLinks-BrainTools, IMBIT (Institute for Machine-Brain Interfacing Technology), University of Freiburg, Georges-Köhler-Allee 201, 79110, Freiburg, Germany.

Publication information

Sci Rep. 2024 Nov 19;14(1):28677. doi: 10.1038/s41598-024-73913-8.

DOI:10.1038/s41598-024-73913-8

PMID:39562595

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11577117/

Abstract

AI-based methods to generate images have seen unprecedented advances in recent years, challenging both image forensic and human perceptual capabilities. Accordingly, these methods are expected to play an increasingly important role in the fraudulent fabrication of data. This includes images with complicated intrinsic structures such as histological tissue samples, which are harder to forge manually. Here, we use stable diffusion, one of the most recent generative algorithms, to create such a set of artificial histological samples. In a large study with over 800 participants, we study the ability of human subjects to discriminate between these artificial and genuine histological images. Although experts perform better than naive participants, we find that even they fail to reliably identify fabricated data. While participant performance depends on the amount of training data used, even low quantities are sufficient to create convincing images, necessitating methods and policies to detect fabricated data in scientific publications.
