Istituto di Informatica e Telematica-CNR, Pisa, Italy.
Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo"-CNR, Pisa, Italy.
PLoS One. 2021 May 13;16(5):e0251415. doi: 10.1371/journal.pone.0251415. eCollection 2021.
Recent advances in language modeling have significantly improved the generative capabilities of deep neural models: in 2019, OpenAI released GPT-2, a pre-trained language model that can autonomously generate coherent, non-trivial, human-like text samples. Since then, ever more powerful text generation models have been developed. Adversaries can exploit these generative capabilities to enhance social bots, enabling them to write plausible deepfake messages and contaminate public debate. To prevent this, it is crucial to develop systems that detect deepfake social media messages. However, to the best of our knowledge, no one has yet addressed the detection of machine-generated text on social networks such as Twitter or Facebook. To support research in this area, we collected the first dataset of real deepfake tweets, TweepFake. It is real in the sense that each deepfake tweet was actually posted on Twitter. We collected tweets from a total of 23 bots imitating 17 human accounts. The bots are based on various generation techniques, namely Markov chains, RNN, RNN+Markov, LSTM, and GPT-2. We also randomly selected tweets from the humans imitated by the bots, yielding an overall balanced dataset of 25,572 tweets (half human-written and half bot-generated). The dataset is publicly available on Kaggle. Lastly, we evaluated 13 deepfake text detection methods (based on various state-of-the-art approaches) both to demonstrate the challenges that TweepFake poses and to create a solid baseline of detection techniques. We hope that TweepFake offers an opportunity to tackle deepfake detection on social media messages as well.
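Markov chains are the simplest of the generation techniques listed above. The following is a minimal sketch of a word-level Markov chain text generator, included only to illustrate the general idea; it is not the implementation used by any of the bots in TweepFake. The function names `build_chain` and `generate`, the `order` and `max_words` parameters, and the toy corpus are all assumptions made for this example.

```python
import random
from collections import defaultdict


def build_chain(texts, order=1):
    """Map each `order`-word state to the list of words observed right after it."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, max_words=20, seed=None):
    """Walk the chain from a random starting state until a dead end or max_words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed gives a repeatable start
    output = list(state)
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:  # no observed continuation for this state
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Hypothetical toy corpus standing in for one account's tweets.
corpus = [
    "the model can generate text",
    "the model can detect fake text",
    "fake text can fool the reader",
]
chain = build_chain(corpus, order=1)
print(generate(chain, max_words=12, seed=0))
```

Because each next word depends only on the previous `order` words, the output is locally plausible but tends to lose coherence over longer spans, which is one reason text from this kind of generator is usually easier to spot than text from neural models.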