Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 57168, Riyadh 21574, Saudi Arabia.
Division of Electronics Engineering, School of Engineering, Cochin University of Science and Technology, India.
Comput Math Methods Med. 2022 Apr 4;2022:7814952. doi: 10.1155/2022/7814952. eCollection 2022.
Diseases of internal organs other than the vocal folds can also affect a person's voice. As a result, voice problems are increasingly common, yet they are frequently overlooked. According to a recent study, voice pathology detection systems can support the assessment of voice abnormalities and enable the early diagnosis of voice pathology. In particular, automatic systems that distinguish healthy from diseased voices have received much attention for the early identification and diagnosis of voice problems. Artificial intelligence-assisted voice analysis thus opens up new possibilities in healthcare. This work aimed to assess the utility of several automatic speech signal analysis methods for diagnosing voice disorders and to propose a strategy for classifying healthy and diseased voices. The proposed framework combines three voice features: chroma, mel spectrogram, and mel frequency cepstral coefficients (MFCC). We also designed a deep neural network (DNN) capable of learning from the extracted features and producing a highly accurate voice-based disease prediction model. The study describes a series of experiments using the Saarbruecken Voice Database (SVD) to detect abnormal voices. The model was developed and tested using the vowels /a/, /i/, and /u/ pronounced at high, low, and average pitches. We also retained the "continuous sentence" audio files collected from SVD to assess how well the developed model generalizes to completely new data. The highest accuracy achieved was 77.49%, superior to prior attempts in the same domain. Additionally, the model attains an accuracy of 88.01% when speaker gender information is integrated. When trained on selected diseases, the model reaches a maximum accuracy of 96.77% (cordectomy × healthy). These results suggest that the proposed framework is well suited to healthcare applications.
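To make the three features more concrete, the following minimal Python sketch shows the frequency mappings that underlie mel-based and chroma features, and one possible way of merging the three feature matrices into a single input vector for a classifier. It is illustrative only and is not the implementation used in the study: the function names are hypothetical, the mel formula is one common variant (2595 · log10(1 + f/700)), and averaging each feature over time frames before concatenation is an assumption, since the abstract does not state how the features were combined.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mels with the common 2595 * log10(1 + f/700) formula.

    Assumption: the abstract does not say which mel variant was used.
    """
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def pitch_class(freq_hz: float, a4_hz: float = 440.0) -> int:
    """Map a frequency to one of the 12 chroma bins (0 = C, ..., 9 = A)."""
    midi_note = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi_note % 12


def mean_over_frames(feature):
    """Average a feature matrix (rows = coefficients, columns = frames) over time."""
    return [sum(row) / len(row) for row in feature]


def combine_features(chroma, mel, mfcc):
    """Concatenate time-averaged chroma, mel, and MFCC rows into one flat vector.

    Assumption: time-averaging followed by concatenation is a hypothetical
    fusion scheme chosen for illustration.
    """
    return mean_over_frames(chroma) + mean_over_frames(mel) + mean_over_frames(mfcc)


if __name__ == "__main__":
    # Toy matrices standing in for real features (values are made up).
    toy_chroma = [[1.0, 3.0]]
    toy_mel = [[2.0, 4.0], [0.0, 2.0]]
    toy_mfcc = [[5.0, 5.0]]
    print(combine_features(toy_chroma, toy_mel, toy_mfcc))
    print(hz_to_mel(1000.0), pitch_class(440.0))
```

In practice, an audio library would compute the chroma, mel spectrogram, and MFCC matrices from each recording; the sketch only shows how such matrices could be reduced to a fixed-length vector regardless of recording duration.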