
A multimodal spectral approach to characterize rhythm in natural speech.

Authors

Alexandrou Anna Maria, Saarinen Timo, Kujala Jan, Salmelin Riitta

Affiliation

Department of Neuroscience and Biomedical Engineering, Aalto University, FI-00076 AALTO, Finland.

Publication information

J Acoust Soc Am. 2016 Jan;139(1):215-26. doi: 10.1121/1.4939496.

DOI: 10.1121/1.4939496
PMID: 26827019
Abstract

Human utterances demonstrate temporal patterning, also referred to as rhythm. While simple oromotor behaviors (e.g., chewing) feature a salient periodical structure, conversational speech displays a time-varying quasi-rhythmic pattern. Quantification of periodicity in speech is challenging. Unimodal spectral approaches have highlighted rhythmic aspects of speech. However, speech is a complex multimodal phenomenon that arises from the interplay of articulatory, respiratory, and vocal systems. The present study addressed the question of whether a multimodal spectral approach, in the form of coherence analysis between electromyographic (EMG) and acoustic signals, would allow one to characterize rhythm in natural speech more efficiently than a unimodal analysis. The main experimental task consisted of speech production at three speaking rates; a simple oromotor task served as control. The EMG-acoustic coherence emerged as a sensitive means of tracking speech rhythm, whereas spectral analysis of either EMG or acoustic amplitude envelope alone was less informative. Coherence metrics seem to distinguish and highlight rhythmic structure in natural speech.
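
The coherence idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the signals are synthetic, both are assumed to be amplitude envelopes that have already been extracted, and the sampling rate, segment length, 4 Hz rate, and noise level are hypothetical values chosen only for illustration. The sketch estimates magnitude-squared coherence by averaging cross- and auto-spectra over non-overlapping Hann-tapered segments.

```python
import cmath
import math
import random


def hann(n):
    """Symmetric Hann taper of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def segment_spectra(signal, seg_len, k_max):
    """Split into non-overlapping segments, remove each segment mean, taper,
    and return the DFT coefficients for bins 1..k_max of every segment."""
    window = hann(seg_len)
    spectra = []
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len]
        mean = sum(seg) / seg_len
        tapered = [(v - mean) * w for v, w in zip(seg, window)]
        spectra.append([
            sum(v * cmath.exp(-2j * math.pi * k * i / seg_len)
                for i, v in enumerate(tapered))
            for k in range(1, k_max + 1)
        ])
    return spectra


def coherence(x, y, fs, seg_len, f_max):
    """Magnitude-squared coherence |Sxy|^2 / (Sxx * Syy), with spectra
    summed over segments. Returns {frequency_hz: coherence}."""
    k_max = int(f_max * seg_len / fs)
    sx = segment_spectra(x, seg_len, k_max)
    sy = segment_spectra(y, seg_len, k_max)
    result = {}
    for j in range(k_max):
        sxx = sum(abs(a[j]) ** 2 for a in sx)
        syy = sum(abs(b[j]) ** 2 for b in sy)
        sxy = sum(a[j] * b[j].conjugate() for a, b in zip(sx, sy))
        result[(j + 1) * fs / seg_len] = abs(sxy) ** 2 / (sxx * syy)
    return result


# Synthetic stand-ins (assumption): two noisy envelopes that share a 4 Hz
# rhythm with a fixed phase lag, sampled at 100 Hz for 40 s.
rng = random.Random(0)
fs, duration_s, rate_hz = 100, 40, 4.0
t = [i / fs for i in range(fs * duration_s)]
emg_env = [math.sin(2 * math.pi * rate_hz * ti) + rng.gauss(0, 1) for ti in t]
audio_env = [math.sin(2 * math.pi * rate_hz * ti - 0.8) + rng.gauss(0, 1)
             for ti in t]

coh = coherence(emg_env, audio_env, fs, seg_len=400, f_max=10)
peak_freq = max(coh, key=coh.get)
print(f"coherence peak near {peak_freq:.2f} Hz")
```

Because the two noise sources are independent, coherence stays low away from the shared rhythm and rises where both signals carry the same periodic component. With real recordings one would first derive the envelopes (for example by rectifying and low-pass filtering) and would likely use overlapping segments from an established signal-processing library.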

Similar articles

1
A multimodal spectral approach to characterize rhythm in natural speech.
J Acoust Soc Am. 2016 Jan;139(1):215-26. doi: 10.1121/1.4939496.
2
Rhythmic patterning in Malaysian and Singapore English.
Lang Speech. 2014 Jun;57(Pt 2):196-214. doi: 10.1177/0023830913496058.
3
Rhythmic structure of Hindi and English: new insights from a computational analysis.
Prog Brain Res. 2008;168:207-72. doi: 10.1016/S0079-6123(07)68017-0.
4
Speech timing and linguistic rhythm: on the acoustic bases of rhythm typologies.
J Acoust Soc Am. 2015 May;137(5):2834. doi: 10.1121/1.4919322.
5
Do rhythm measures reflect perceived rhythm?
Phonetica. 2009;66(1-2):78-94. doi: 10.1159/000208932. Epub 2009 Apr 8.
6
Acoustic and articulatory features of diphthong production: a speech clarity study.
J Speech Lang Hear Res. 2010 Feb;53(1):84-99. doi: 10.1044/1092-4388(2009/08-0124). Epub 2009 Nov 30.
7
The effect of speaking context on spectral- and cepstral-based acoustic features of normal voice.
Clin Linguist Phon. 2016;30(1):1-11. doi: 10.3109/02699206.2015.1087049. Epub 2015 Nov 23.
8
Rhythmic variability between speakers: articulatory, prosodic, and linguistic factors.
J Acoust Soc Am. 2015 Mar;137(3):1513-28. doi: 10.1121/1.4906837.
9
Speech rhythm analysis with decomposition of the amplitude envelope: characterizing rhythmic patterns within and across languages.
J Acoust Soc Am. 2013 Jul;134(1):628-39. doi: 10.1121/1.4807565.
10
Aspects of voice irregularity measurement in connected speech.
Folia Phoniatr Logop. 2009;61(3):126-36. doi: 10.1159/000219948. Epub 2009 Jul 1.

Cited by

1
Modulation transfer functions for audiovisual speech.
PLoS Comput Biol. 2022 Jul 19;18(7):e1010273. doi: 10.1371/journal.pcbi.1010273. eCollection 2022 Jul.
2
Correcting MEG Artifacts Caused by Overt Speech.
Front Neurosci. 2021 Jun 8;15:682419. doi: 10.3389/fnins.2021.682419. eCollection 2021.
3
Predictive entrainment of natural speech through two fronto-motor top-down channels.
Lang Cogn Neurosci. 2018 Sep 26;35(6):739-751. doi: 10.1080/23273798.2018.1506589.
4
Mapping the neuroanatomical impact of very preterm birth across childhood.
Hum Brain Mapp. 2020 Mar;41(4):892-905. doi: 10.1002/hbm.24847. Epub 2019 Nov 5.
5
The Effect of Speech Repetition Rate on Neural Activation in Healthy Adults: Implications for Treatment of Aphasia and Other Fluency Disorders.
Front Hum Neurosci. 2018 Feb 27;12:69. doi: 10.3389/fnhum.2018.00069. eCollection 2018.