Evaluation of digital watermarking on subjective speech quality.

Affiliation

Department of Measurement, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, Prague, 166 27, Czech Republic.

Publication information

Sci Rep. 2021 Oct 12;11(1):20185. doi: 10.1038/s41598-021-99811-x.

DOI:10.1038/s41598-021-99811-x
PMID:34642471
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8511066/
Abstract

New methods of securing the distribution of audio content have been widely deployed in the last twenty years. Their impact on perceived quality has, however, seldom been the subject of recent extensive research. We review the state of the art in digital speech watermarking and provide subjective testing of watermarked speech samples. The latest speech watermarking techniques are listed, with their specifics and potential for further development. Their current and possible applications are evaluated. Open-source software designed to embed watermarking patterns in audio files is used to produce a set of samples that satisfies the requirements of modern subjective speech-quality assessments. The patchwork algorithm implemented in the application is the main focus of this analysis. Different watermark robustness levels are used, which allows the threshold of detection by human listeners to be determined. The subjective listening tests are conducted following ITU-T Recommendation P.800, which precisely defines the conditions and requirements for subjective testing. Further analysis tries to determine the effects of noise and various disturbances on the perceived quality of watermarked speech. A threshold of intelligibility is estimated to open the way to further work on speech compression techniques with watermarking. The impact of language or social background is evaluated through an additional experiment involving two groups of listeners. Results show significant robustness of the watermarking implementation, which retains both a reasonable net subjective audio quality and its security attributes despite mild levels of distortion and noise. Extended experiments with Chinese listeners open the door to formulating a hypothesis on perception variations with geographical and social backgrounds.
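
The abstract reports listening tests run under ITU-T P.800, where listeners typically rate each sample on a five-point absolute category scale and the ratings are averaged into a mean opinion score (MOS). As a rough illustration only, the sketch below computes a MOS and an approximate 95% confidence half-width for one test condition. The function name `mos_with_ci`, the sample votes, and the normal-approximation interval (z = 1.96) are assumptions made for this example; the abstract does not state how the authors aggregated their scores.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, CI half-width) for ratings on the 1-5 category scale.

    The interval uses a normal approximation (an assumption of this
    sketch, not something taken from the paper).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical votes from eight listeners for one watermarked sample.
ratings = [4, 5, 4, 3, 4, 5, 4, 4]
mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Comparing such intervals across watermark strength levels is one simple way to look for the point at which listeners begin to notice the watermark; a small listener panel would usually call for a t-distribution interval instead.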


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/493f/8511066/2fa9389163da/41598_2021_99811_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/493f/8511066/426fe8baa832/41598_2021_99811_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/493f/8511066/c9858f226841/41598_2021_99811_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/493f/8511066/a357237b7f09/41598_2021_99811_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/493f/8511066/f2777b95902b/41598_2021_99811_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/493f/8511066/4e84b892e906/41598_2021_99811_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/493f/8511066/19d1974ef250/41598_2021_99811_Fig7_HTML.jpg

Similar articles

1
Evaluation of digital watermarking on subjective speech quality.
Sci Rep. 2021 Oct 12;11(1):20185. doi: 10.1038/s41598-021-99811-x.
2
Adaptive multiwavelet-based watermarking through JPW masking.
IEEE Trans Image Process. 2011 Apr;20(4):1047-60. doi: 10.1109/TIP.2010.2079551. Epub 2010 Sep 27.
3
Spread spectrum image watermarking based on perceptual quality metric.
IEEE Trans Image Process. 2011 Nov;20(11):3207-18. doi: 10.1109/TIP.2011.2146263. Epub 2011 Apr 21.
4
Robust image watermarking based on multiband wavelets and empirical mode decomposition.
IEEE Trans Image Process. 2007 Aug;16(8):1956-66. doi: 10.1109/tip.2007.901206.
5
Multiscale fragile watermarking based on the Gaussian mixture model.
IEEE Trans Image Process. 2006 Oct;15(10):3189-200. doi: 10.1109/tip.2006.877310.
6
Multipurpose image watermarking algorithm based on multistage vector quantization.
IEEE Trans Image Process. 2005 Jun;14(6):822-31. doi: 10.1109/tip.2005.847324.
7
Analysis and design of watermarking algorithms for improved resistance to compression.
IEEE Trans Image Process. 2004 Feb;13(2):126-44. doi: 10.1109/tip.2004.823830.
8
Ergodic chaotic parameter modulation with application to digital image watermarking.
IEEE Trans Image Process. 2005 Oct;14(10):1590-602. doi: 10.1109/tip.2005.854475.
9
An optimal detector structure for the Fourier descriptors domain watermarking of 2D vector graphics.
IEEE Trans Vis Comput Graph. 2007 Sep-Oct;13(5):851-63. doi: 10.1109/TVCG.2007.1050.
10
Dual domain watermarking for authentication and compression of cultural heritage images.
IEEE Trans Image Process. 2004 Mar;13(3):430-48. doi: 10.1109/tip.2003.821552.

Cited by

1
An improved reversible watermarking scheme using embedding optimization and quaternion moments.
Sci Rep. 2024 Aug 9;14(1):18485. doi: 10.1038/s41598-024-69511-3.

References

1
The Diversity of Tone Languages and the Roles of Pitch Variation in Non-tone Languages: Considerations for Tone Perception Research.
Front Psychol. 2019 Feb 26;10:364. doi: 10.3389/fpsyg.2019.00364. eCollection 2019.