Audio deepfakes: A survey.

Authors

Khanjani Zahra, Watson Gabrielle, Janeja Vandana P

Affiliation

Department of Information System, University of Maryland Baltimore County, Baltimore, MD, United States.

Publication information

Front Big Data. 2023 Jan 9;5:1001063. doi: 10.3389/fdata.2022.1001063. eCollection 2022.

DOI: 10.3389/fdata.2022.1001063
PMID: 36700137
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9869423/
Abstract

A deepfake is content or material that is synthetically generated or manipulated using artificial intelligence (AI) methods so that it can be passed off as real; it can include audio, video, image, and text synthesis. The key difference between manual editing and deepfakes is that deepfakes are AI-generated or AI-manipulated and closely resemble authentic artifacts. In some cases, deepfakes can be fabricated entirely from AI-generated content. Deepfakes have started to have a major impact on society, with more generation mechanisms emerging every day. This article contributes to understanding the landscape of deepfakes and their generation and detection methods. We evaluate various categories of deepfakes, especially audio. The purpose of this survey is to provide readers with a deeper understanding of (1) the different deepfake categories; (2) how they can be created and detected; and (3) more specifically, how audio deepfakes are created and detected, which is the main focus of this paper. We found that generative adversarial networks (GANs), convolutional neural networks (CNNs), and deep neural networks (DNNs) are common ways of creating and detecting deepfakes. In our evaluation of over 150 methods, we found that most of the focus is on video deepfakes and, in particular, on their generation. For text deepfakes, we found more generation methods but very few robust detection methods, including for fake news detection, which has become a controversial area of research because of the potentially heavy overlap with human-generated fake content. Our study reveals a clear need for research on audio deepfakes, particularly their detection. Unlike existing survey papers, which mostly focus on video and image deepfakes, this survey mainly focuses on audio deepfakes, which most existing surveys overlook. This article's most important contribution is to critically analyze audio deepfake research, mostly from 2016 to 2021, and to provide a unique source for it. To the best of our knowledge, this is the first survey in English focusing on audio deepfake generation and detection.
