Nylén Fredrik
Department of Clinical Science, Faculty of Medicine, Umeå University, Umeå, Västerbotten, Sweden.
Front Hum Neurosci. 2025 Apr 28;19:1566274. doi: 10.3389/fnhum.2025.1566274. eCollection 2025.
This study aimed to determine the acoustic properties most indicative of dysprosody severity in patients with Parkinson's disease using an automated acoustic assessment procedure.
A total of 108 read speech recordings of 68 speakers with PD (45 male, 23 female, aged 65.0 ± 9.8 years) were made during active levodopa treatment. A total of 40 of the patients were additionally recorded without levodopa treatment to increase the range of dysprosody severity in the sample. Four human clinical experts independently assessed the patients' recordings in terms of dysprosody severity. Separately, a speech processing pipeline extracted the acoustic properties of prosodic relevance from automatically identified portions of speech used as utterance proxies. Five machine learning models were trained on 75% of the speech portions and the perceptual evaluations of the speaker's dysprosody severity in a 10-fold cross-validation procedure. They were evaluated regarding their ability to predict the perceptual assessments of recordings excluded during training. The models' performances were assessed by their ability to accurately predict clinical experts' dysprosody severity assessments.
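The partitioning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the abstract does not say whether the 75% split was drawn by recording or by speaker, so the sketch holds out whole recordings, and the function names, the seed, and the three portions per recording are hypothetical.

```python
import random


def split_recordings(recording_ids, train_fraction=0.75, seed=0):
    """Split recording IDs into disjoint training and held-out sets.

    Assumption: whole recordings are held out, so no recording
    contributes speech portions to both sets.
    """
    unique_ids = sorted(set(recording_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_train = round(train_fraction * len(unique_ids))
    return set(unique_ids[:n_train]), set(unique_ids[n_train:])


def k_folds(items, k=10, seed=0):
    """Yield (train, validation) lists for k-fold cross-validation."""
    items = sorted(items)
    rng = random.Random(seed)
    rng.shuffle(items)
    # Deal items round-robin into k folds of near-equal size.
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, validation


# Hypothetical data: 108 recordings with three speech portions each.
portions = [(f"rec{r:03d}", p) for r in range(108) for p in range(3)]

train_recs, test_recs = split_recordings(rec for rec, _ in portions)
train_portions = [x for x in portions if x[0] in train_recs]
test_portions = [x for x in portions if x[0] in test_recs]

# Ten cross-validation folds drawn from the training recordings only.
folds = list(k_folds(train_recs, k=10))
```

A model would be fitted on the portions belonging to each fold's training recordings, tuned on that fold's validation recordings, and finally scored on `test_portions`, which no fold ever sees.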
The acoustic predictors of importance spanned several acoustic domains of prosodic relevance, with the variability in change between intonational turning points and the average first Mel-frequency cepstral coefficient at these points being the two top predictors. While predominant in the literature, variability in utterance-wide pitch was found to be only the fifth strongest predictor.
Human expert raters' assessments of dysprosody can be approximated by the automated procedure, affording application in clinical settings where an experienced expert is unavailable. Variability in pitch does not adequately describe the level of dysprosody due to Parkinson's disease.