Doyle Philip C, Ghasemzadeh Hamzeh, Searl Jeff
Otolaryngology Head and Neck Surgery, Division of Laryngology, School of Medicine Stanford University, Stanford University, Stanford, CA 94305, USA.
Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA.
Appl Sci (Basel). 2024 Jan 1;14(1). doi: 10.3390/app14010214. Epub 2023 Dec 26.
This study pursued two objectives: (1) to determine the potential association between listener (n = 51) judgments of 20 male tracheoesophageal (TE) speaker samples for two auditory-perceptual dimensions of voice, overall severity (OS) and listener comfort (LC); and (2) to assess the temporal and spectral acoustic correlates for these auditory-perceptual dimensions.
Three separate correlation analyses were performed to evaluate the association between OS and LC. First, scores of OS and LC from all listeners were pooled together, and then the correlation between OS and LC was computed. Second, scores of OS and LC were averaged over all listeners to derive a single estimate of OS and LC for each TE speaker sample; the correlation between the average OS and LC was then computed. Third, listener-to-listener variability in the association between OS and LC was evaluated by computing the correlation between OS and LC scores from each listener across all TE samples. Finally, two stepwise multiple regression models were created to relate the average LC score to spectral and temporal variation in the acoustic signal.
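The three correlation analyses described above can be illustrated with a minimal sketch. It assumes the ratings are stored as two arrays of shape (listeners, samples) and that Pearson's r is the correlation statistic; the function names, the synthetic ratings, and the noise level are all hypothetical and are not taken from the study's data or code.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two arrays, flattened to 1-D."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    return float(np.corrcoef(x, y)[0, 1])

def three_correlations(os_scores, lc_scores):
    """Return pooled, averaged, and per-listener OS-LC correlations.

    os_scores, lc_scores: arrays of shape (n_listeners, n_samples).
    """
    os_scores = np.asarray(os_scores, dtype=float)
    lc_scores = np.asarray(lc_scores, dtype=float)
    # 1) Pooled: every (listener, sample) rating pair is one observation.
    pooled = pearson_r(os_scores, lc_scores)
    # 2) Averaged: mean over listeners gives one OS and one LC per sample.
    averaged = pearson_r(os_scores.mean(axis=0), lc_scores.mean(axis=0))
    # 3) Per listener: one correlation per listener across all samples.
    per_listener = np.array(
        [pearson_r(o, l) for o, l in zip(os_scores, lc_scores)]
    )
    return pooled, averaged, per_listener

# Synthetic ratings (assumption): both dimensions track one latent
# per-sample quantity, plus independent listener noise.
rng = np.random.default_rng(0)
n_listeners, n_samples = 51, 20
latent = rng.uniform(0, 100, size=n_samples)
os_scores = latent + rng.normal(0, 15, size=(n_listeners, n_samples))
lc_scores = latent + rng.normal(0, 15, size=(n_listeners, n_samples))

pooled, averaged, per_listener = three_correlations(os_scores, lc_scores)
print(f"pooled r = {pooled:.2f}, averaged r = {averaged:.2f}")
print(f"per-listener r range = {per_listener.min():.2f} to {per_listener.max():.2f}")
```

In this toy setup, averaging over listeners cancels much of the listener-specific noise, so the averaged correlation exceeds the pooled one, while the per-listener values spread around the pooled value.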
While the pooled OS and LC scores had a moderate positive correlation (r = 0.66, p < 0.00001), the averaged OS and LC exhibited a near-perfect positive correlation (r = 0.99, p < 0.00001). The difference between the pooled and averaged correlations was explained by significant listener-to-listener variability in the association between OS and LC. OS and LC scores from 5 listeners had non-significant correlations, 10 listeners had moderate correlations (r < 0.7), 35 listeners had high correlations (0.7 < r < 0.9), and 1 listener had a very high correlation (0.9 < r < 1). Finally, the acoustic models created based on the spectral and temporal variations in the signal were able to account for 87.7% and 61.8% of variation in the average LC score.
The strong correlations between OS and LC suggest that LC may, in fact, provide a more comprehensive auditory-perceptual surrogate for the voice quality of TE speakers. Although OS and LC are distinct conceptual dimensions, LC appears to have the advantage of assessing the social impact and potential communication disability that may exist in interactions between TE speakers and listeners.