An Experimental Analysis on Multicepstral Projection Representation Strategies for Dysphonia Detection.

Affiliations

Department of Computer Science and Statistics, Institute of Biosciences, Letters and Exact Sciences, São Paulo State University, São José do Rio Preto 15054-000, SP, Brazil.

Federal Institute of São Paulo, São José do Rio Preto 15030-070, SP, Brazil.

Publication information

Sensors (Basel). 2023 May 30;23(11):5196. doi: 10.3390/s23115196.

Abstract

Biometrics-based authentication has become the most well-established form of user recognition in systems that demand a certain level of security. It is used in the most commonplace activities, such as gaining access to the workplace or to one's own bank account. Among all biometrics, voice receives special attention due to factors such as ease of collection, the low cost of reading devices, and the large body of literature and software packages available. However, the ability of this biometric to represent an individual may be impaired by the phenomenon known as dysphonia, a change in the sound signal caused by a disease that acts on the vocal apparatus. As a consequence, a user with the flu, for example, may not be properly authenticated by the recognition system. It is therefore important to develop automatic techniques for detecting dysphonia in the voice. In this work, we propose a new framework that represents the voice signal by multiple projections of cepstral coefficients in order to detect dysphonic alterations in the voice through machine learning techniques. Most of the best-known cepstral coefficient extraction techniques in the literature are mapped and analyzed, both separately and together with measures related to the fundamental frequency of the voice signal, and their representation capacity is evaluated on three classifiers. Finally, experiments on a subset of the Saarbruecken Voice Database demonstrate the effectiveness of the proposed approach in detecting the presence of dysphonia in the voice.
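To make the idea of combining cepstral coefficients with a fundamental-frequency measure concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say which cepstral extraction techniques, F0 measures, or classifiers were used, so this example assumes a plain real cepstrum and an autocorrelation-based F0 estimate as stand-ins. All function names, the number of coefficients, the F0 search range, and the synthetic test signal are hypothetical choices made for illustration.

```python
import numpy as np


def real_cepstrum(frame, n_coeffs=13):
    """Real cepstrum of one frame: inverse FFT of the log magnitude spectrum.

    Assumption: a generic real cepstrum, not one of the specific
    extraction techniques evaluated in the paper.
    """
    windowed = frame * np.hamming(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    log_magnitude = np.log(magnitude + 1e-10)  # small offset avoids log(0)
    cepstrum = np.fft.irfft(log_magnitude)
    return cepstrum[:n_coeffs]


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Rough F0 estimate from the strongest autocorrelation peak.

    Assumption: the search range 60-400 Hz is an illustrative choice.
    """
    centered = frame - np.mean(frame)
    # Keep only non-negative lags of the full autocorrelation.
    autocorr = np.correlate(centered, centered, mode="full")[len(centered) - 1:]
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag = lag_min + int(np.argmax(autocorr[lag_min:lag_max + 1]))
    return sample_rate / best_lag


def frame_features(frame, sample_rate, n_coeffs=13):
    """Concatenate cepstral coefficients with the F0 estimate."""
    cepstral = real_cepstrum(frame, n_coeffs)
    f0 = estimate_f0(frame, sample_rate)
    return np.concatenate([cepstral, [f0]])


# Synthetic voiced-like frame: a 150 Hz tone plus its second harmonic.
sr = 16000
t = np.arange(1024) / sr
frame = np.sin(2 * np.pi * 150 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
features = frame_features(frame, sr)
print(features.shape, features[-1])
```

In a full pipeline, one such vector per frame (or a per-recording summary of them) would be passed to a classifier trained on healthy and dysphonic recordings; that stage is omitted here because the abstract does not name the three classifiers.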

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84b4/10256083/af27f0304d9b/sensors-23-05196-g001.jpg
