Perturbation Measurements on the Degree of Naturalness of Synthesized Vowels.

Author information

Yamasaki Rosiane, Montagnoli Arlindo, Murano Emi Z, Gebrim Eloisa, Hachiya Adriana, Lopes da Silva Jorge Vicente, Behlau Mara, Tsuji Domingos

Affiliations

Department of Otorhinolaryngology, Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil.

Department of Electrical Engineering, Universidade de São Paulo-São Carlos, Brazil.

Publication information

J Voice. 2017 May;31(3):389.e1-389.e8. doi: 10.1016/j.jvoice.2016.09.020. Epub 2016 Oct 21.

Abstract

OBJECTIVE

To determine the impact of jitter and shimmer on the perceived degree of naturalness of synthesized vowels produced by acoustic simulation with glottal pulses (GP) and a solid model of the vocal tract (SMVT).

STUDY DESIGN

Prospective study.

METHODS

Synthesized vowels were produced in three steps: (1) eighty GP were developed (20 with jitter, 20 with shimmer, 20 with jitter+shimmer, and 20 without perturbation); (2) an SMVT was built with rapid prototyping technology from magnetic resonance imaging (MRI) of a woman during phonation of /ε/; (3) acoustic simulations were performed to obtain eighty synthesized /ε/ vowels. Two experiments were performed. First experiment: three judges rated 120 vowels (20 human + 80 synthesized + 20% repetition) as "human" or "synthesized". Second experiment: twenty PowerPoint slide sequences were created. Each slide had four synthesized vowels produced under the four perturbation conditions. Evaluators were asked to rank the vowels from the most natural to the most artificial.
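As a rough illustration of the four perturbation conditions, the sketch below perturbs the period (jitter) and the amplitude (shimmer) of a sequence of glottal cycles and then measures the resulting cycle-to-cycle variation. It is not the authors' synthesis procedure: the function names, the 200 Hz fundamental frequency, the uniform random perturbation, and the 1% and 5% perturbation levels are all assumptions chosen for the example, since the abstract does not report them.

```python
import random


def perturbed_cycles(n_cycles, f0_hz, jitter_pct, shimmer_pct, seed=0):
    """Return (periods_in_seconds, amplitudes) for n_cycles glottal cycles.

    Hypothetical model: each period and amplitude is scaled by a uniform
    random factor within +/- the given percentage of its nominal value.
    """
    rng = random.Random(seed)
    base_period = 1.0 / f0_hz
    periods, amplitudes = [], []
    for _ in range(n_cycles):
        periods.append(base_period * (1.0 + rng.uniform(-1, 1) * jitter_pct / 100.0))
        amplitudes.append(1.0 + rng.uniform(-1, 1) * shimmer_pct / 100.0)
    return periods, amplitudes


def local_perturbation_pct(values):
    """Mean absolute difference between consecutive values, divided by the
    mean value, in percent (the usual 'local' jitter/shimmer definition)."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Assumed perturbation levels (jitter %, shimmer %) for the four conditions.
CONDITIONS = {
    "jitter": (1.0, 0.0),
    "shimmer": (0.0, 5.0),
    "jitter+shimmer": (1.0, 5.0),
    "no perturbation": (0.0, 0.0),
}

if __name__ == "__main__":
    for name, (jit, shim) in CONDITIONS.items():
        periods, amps = perturbed_cycles(100, 200.0, jit, shim)
        print(name,
              round(local_perturbation_pct(periods), 3),
              round(local_perturbation_pct(amps), 3))
```

In the unperturbed condition both measures come out as zero, which corresponds to the perfectly periodic pulse train that the study uses as its baseline.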

RESULTS

First experiment: all the human vowels were classified as human; 27 of the eighty synthesized vowels were rated as human, of which 15 were produced with jitter+shimmer, 10 with jitter, 2 without perturbation, and none with shimmer. Second experiment: vowels produced with jitter+shimmer were considered the most natural. Vowels with shimmer and without perturbation were considered the most artificial.
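The counts from the first experiment can be checked and expressed as proportions with a short tally. The numbers below are taken from the abstract; the variable names are illustrative, and the per-condition shares assume that each of the 20 vowels in a condition is counted once, which the abstract implies but does not state explicitly.

```python
# Synthesized vowels rated as "human", by perturbation condition (from the abstract).
rated_human = {
    "jitter+shimmer": 15,
    "jitter": 10,
    "no perturbation": 2,
    "shimmer": 0,
}
VOWELS_PER_CONDITION = 20  # 80 synthesized vowels across four conditions

total_rated_human = sum(rated_human.values())
total_synthesized = VOWELS_PER_CONDITION * len(rated_human)

# Share of each condition's vowels that were rated as human.
share_by_condition = {
    name: count / VOWELS_PER_CONDITION for name, count in rated_human.items()
}

print(total_rated_human, "of", total_synthesized)
for name, share in share_by_condition.items():
    print(f"{name}: {share:.0%}")
```

The four counts sum to the reported 27 of 80, and three quarters of the jitter+shimmer vowels were rated as human, compared with none of the shimmer-only vowels.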

CONCLUSIONS

The combination of jitter and shimmer increased the degree of naturalness of synthesized vowels. Acoustic simulations performed with GP and an SMVT offer a possible method for testing the effect of perturbation measurements on synthesized voices.
