Machine Learning-Assisted Speech Analysis for Early Detection of Parkinson's Disease: A Study on Speaker Diarization and Classification Techniques.

Affiliation

Department of Engineering and Geology, University G. D'Annunzio of Chieti-Pescara, 65127 Pescara, Italy.

Publication information

Sensors (Basel). 2024 Feb 26;24(5):1499. doi: 10.3390/s24051499.

Abstract

Parkinson's disease (PD) is a neurodegenerative disorder characterized by a range of motor and non-motor symptoms. One of the notable non-motor symptoms of PD is the presence of vocal disorders, attributed to the underlying pathophysiological changes in the neural control of the laryngeal and vocal tract musculature. From this perspective, the integration of machine learning (ML) techniques in the analysis of speech signals has significantly contributed to the detection and diagnosis of PD. In particular, Mel-Frequency Cepstral Coefficients (MFCCs) and Gammatone Frequency Cepstral Coefficients (GTCCs) are feature extraction techniques commonly used in speech and audio signal processing that show great potential for identifying vocal disorders. This study presents a novel approach to the early detection of PD through ML applied to speech analysis, leveraging both MFCCs and GTCCs. The recordings contained in the Mobile Device Voice Recordings at King's College London (MDVR-KCL) dataset were used. These recordings were collected from healthy individuals and PD patients while they read a passage and during a spontaneous conversation on the phone. The speech data from the spontaneous dialogue task were processed through speaker diarization, a technique that partitions an audio stream into homogeneous segments according to speaker identity. ML applied to MFCCs and GTCCs allowed us to classify PD patients with a test accuracy of 92.3%. This research further demonstrates the potential of mobile phones as a non-invasive, cost-effective tool for the early detection of PD, which could significantly improve patient prognosis and quality of life.
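
As an illustration of the feature extraction step mentioned in the abstract, the following is a minimal NumPy sketch of a textbook MFCC computation (framing, Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT-II). It is not the authors' pipeline: the abstract does not state any extraction parameters, so the function names (`mfcc`, `mel_filterbank`) and all defaults (25 ms frames, 10 ms hop, 26 filters, 13 coefficients, 512-point FFT) are assumptions chosen for illustration. GTCCs follow the same overall structure but, broadly speaking, use a gammatone filterbank in place of the mel filterbank.

```python
import numpy as np


def hz_to_mel(f):
    # Common (O'Shaughnessy) formula for converting Hz to mel.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mfcc(signal, sample_rate, n_mfcc=13, n_filters=26,
         frame_len=0.025, hop_len=0.010, n_fft=512):
    # All defaults are illustrative assumptions, not values from the paper.
    # Assumes the frame size in samples does not exceed n_fft.
    signal = np.asarray(signal, dtype=float)
    frame_size = int(round(frame_len * sample_rate))
    hop_size = int(round(hop_len * sample_rate))
    n_frames = 1 + (len(signal) - frame_size) // hop_size

    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = (np.arange(frame_size)[None, :]
           + hop_size * np.arange(n_frames)[:, None])
    frames = signal[idx] * np.hamming(frame_size)

    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft

    # Mel-band energies, then log compression (small offset avoids log(0)).
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)

    # Unnormalized DCT-II across the filter axis; keep the first n_mfcc terms.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_mel @ dct_basis.T  # shape: (n_frames, n_mfcc)
```

Calling `mfcc(x, 16000)` on a one-second, 16 kHz signal `x` returns one row of 13 coefficients per 10 ms hop. In a classification setting such per-frame vectors would typically be summarized or stacked before being passed to a classifier; how the study does this is not described in the abstract.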

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbb0/10934449/fa74322d4924/sensors-24-01499-g001.jpg
