Efficient Attention Branch Network with Combined Loss Function for Automatic Speaker Verification Spoof Detection.

Authors

Rostami Amir Mohammad, Homayounpour Mohammad Mehdi, Nickabadi Ahmad

Affiliation

Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran.

Publication information

Circuits Syst Signal Process. 2023 Feb 23:1-19. doi: 10.1007/s00034-023-02314-5.

DOI:10.1007/s00034-023-02314-5
PMID:36852137
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9947936/
Abstract

Many studies have sought to develop countermeasures that make Automatic Speaker Verification (ASV) systems more robust against spoofing attacks. As evidenced by the latest ASVspoof 2019 countermeasure challenge, even the best models currently deployed for ASV lack adequate generalization to unseen attacks. Jointly improving the components of ASV spoof detection systems, including the classifier, the feature extraction phase, and the model loss function, may lead to better detection of attacks. Accordingly, the present study proposes the Efficient Attention Branch Network (EABN) architecture with a combined loss function to improve model generalization to unseen attacks. The EABN is built on an attention branch and a perception branch. The attention branch provides an attention mask that improves classification performance and is also interpretable from a human point of view. The perception branch is used for the main task, spoof detection. The new EfficientNet-A0 architecture was optimized and employed for the perception branch, with nearly ten times fewer parameters and approximately seven times fewer floating-point operations than SE-Res2Net50, the best existing network. On the ASVspoof 2019 dataset, the proposed method achieved EER = 0.86% and t-DCF = 0.0239 in the Physical Access (PA) scenario using logPowSpec as the input feature. Furthermore, using the LFCC feature and SE-Res2Net50 for the perception branch, the proposed model achieved EER = 1.89% and t-DCF = 0.507 in the Logical Access (LA) scenario, which, to the best of our knowledge, makes it the best single-system ASV spoofing countermeasure.
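The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. The sketch below illustrates one simple way to approximate it from two lists of detector scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the threshold sweep, and the convention that higher scores mean "more likely bona fide" are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means the trial is more likely bona fide.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: bona fide trials scoring below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed trials scoring at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

For example, with the toy scores `[0.9, 0.8, 0.7, 0.3]` for bona fide trials and `[0.6, 0.4, 0.2, 0.1]` for spoofed trials, one trial of each kind is misclassified at the best threshold, so the function returns 0.25 (an EER of 25%). Evaluation toolkits typically interpolate between thresholds instead of picking the closest one, so values on real data may differ slightly.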


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/545e/9947936/90126612dda7/34_2023_2314_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/545e/9947936/592f95b918e2/34_2023_2314_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/545e/9947936/fc3dbf62d1f6/34_2023_2314_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/545e/9947936/9e67f00f5dc7/34_2023_2314_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/545e/9947936/450465df1a84/34_2023_2314_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/545e/9947936/888effa96ca7/34_2023_2314_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/545e/9947936/f53bb222461a/34_2023_2314_Fig7_HTML.jpg

Similar articles

1
Efficient Attention Branch Network with Combined Loss Function for Automatic Speaker Verification Spoof Detection.
Circuits Syst Signal Process. 2023 Feb 23:1-19. doi: 10.1007/s00034-023-02314-5.
2
Voice spoofing detection using a neural networks assembly considering spectrograms and mel frequency cepstral coefficients.
PeerJ Comput Sci. 2023 Dec 18;9:e1740. doi: 10.7717/peerj-cs.1740. eCollection 2023.
3
A blended framework for audio spoof detection with sequential models and bags of auditory bites.
Sci Rep. 2024 Aug 30;14(1):20192. doi: 10.1038/s41598-024-71026-w.
4
Gaussian-Filtered High-Frequency-Feature Trained Optimized BiLSTM Network for Spoofed-Speech Classification.
Sensors (Basel). 2023 Jul 24;23(14):6637. doi: 10.3390/s23146637.
5
Spoofing Detection in Automatic Speaker Verification Systems Using DNN Classifiers and Dynamic Acoustic Features.
IEEE Trans Neural Netw Learn Syst. 2018 Oct;29(10):4633-4644. doi: 10.1109/TNNLS.2017.2771947. Epub 2017 Dec 4.
6
Toward Realigning Automatic Speaker Verification in the Era of COVID-19.
Sensors (Basel). 2022 Mar 30;22(7):2638. doi: 10.3390/s22072638.
7
Spoof Trace Disentanglement for Generic Face Anti-Spoofing.
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3813-3830. doi: 10.1109/TPAMI.2022.3176387. Epub 2023 Feb 3.
8
BPCNN: Bi-Point Input for Convolutional Neural Networks in Speaker Spoofing Detection.
Sensors (Basel). 2022 Jun 14;22(12):4483. doi: 10.3390/s22124483.
9
Short-time speaker verification with different speaking style utterances.
PLoS One. 2020 Nov 11;15(11):e0241809. doi: 10.1371/journal.pone.0241809. eCollection 2020.
10
DEFAEK: Domain Effective Fast Adaptive Network for Face Anti-Spoofing.
Neural Netw. 2023 Apr;161:83-91. doi: 10.1016/j.neunet.2023.01.018. Epub 2023 Jan 25.