Automatic detection of prosodic boundaries in spontaneous speech.

Affiliations

Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel.

Sagol Center for Brain and Mind, Interdisciplinary Center, Herzliya, Israel.

Publication information

PLoS One. 2021 May 3;16(5):e0250969. doi: 10.1371/journal.pone.0250969. eCollection 2021.

DOI: 10.1371/journal.pone.0250969
PMID: 33939754
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8092678/

Abstract

Automatic speech recognition (ASR) and natural language processing (NLP) are expected to benefit from an effective, simple, and reliable method to automatically parse conversational speech. The ability to parse conversational speech depends crucially on the ability to identify boundaries between prosodic phrases. This is done naturally by the human ear, yet has proved surprisingly difficult to achieve reliably and simply in an automatic manner. Efforts to date have focused on detecting phrase boundaries using a variety of linguistic and acoustic cues. We propose a method which does not require model training and utilizes two prosodic cues that are based on ASR output. Boundaries are identified using discontinuities in speech rate (pre-boundary lengthening and phrase-initial acceleration) and silent pauses. The resulting phrases preserve syntactic validity, exhibit pitch reset, and compare well with manual tagging of prosodic boundaries. Collectively, our findings support the notion of prosodic phrases that represent coherent patterns across textual and acoustic parameters.
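
The two cues named in the abstract (silent pauses and speech-rate discontinuities computed from ASR output) can be illustrated with a minimal sketch. This is not the authors' implementation. The `Word` record, the seconds-per-character proxy for speech rate, and both thresholds (`min_pause`, `slowdown_ratio`) are assumptions chosen only for illustration, and the timestamps are made-up example data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One ASR word with timestamps (hypothetical record layout)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def seconds_per_char(word: Word) -> float:
    """Crude inverse speech rate: word duration divided by letter count."""
    return (word.end - word.start) / max(len(word.text), 1)


def detect_boundaries(words: List[Word],
                      min_pause: float = 0.2,        # assumed threshold
                      slowdown_ratio: float = 2.0    # assumed threshold
                      ) -> List[int]:
    """Return indices i such that a boundary is placed after words[i]."""
    boundaries = []
    for i in range(len(words) - 1):
        current, following = words[i], words[i + 1]
        # Cue 1: a silent pause between consecutive words.
        if following.start - current.end >= min_pause:
            boundaries.append(i)
            continue
        # Cue 2: a slow word (pre-boundary lengthening) followed by a
        # fast word (phrase-initial acceleration).
        next_rate = seconds_per_char(following)
        if next_rate > 0 and seconds_per_char(current) / next_rate >= slowdown_ratio:
            boundaries.append(i)
    return boundaries


def split_phrases(words: List[Word], boundaries: List[int]) -> List[str]:
    """Cut the word sequence after each boundary index."""
    phrases, start = [], 0
    for b in boundaries:
        phrases.append(" ".join(w.text for w in words[start:b + 1]))
        start = b + 1
    if start < len(words):
        phrases.append(" ".join(w.text for w in words[start:]))
    return phrases


# Made-up timestamps: "home" is lengthened and "and" is fast,
# and a 0.43 s pause follows "slept".
words = [
    Word("so", 0.00, 0.15), Word("I", 0.15, 0.25),
    Word("went", 0.25, 0.50), Word("home", 0.50, 1.00),
    Word("and", 1.00, 1.12), Word("slept", 1.12, 1.47),
    Word("then", 1.90, 2.10), Word("left", 2.10, 2.40),
]
boundaries = detect_boundaries(words)        # [3, 5]
phrases = split_phrases(words, boundaries)
# ['so I went home', 'and slept', 'then left']
```

The sketch needs no model training, which mirrors the abstract's claim, but a realistic version would likely measure rate in syllables or phonemes rather than letters and tune the thresholds on tagged data.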

Similar articles

1
Automatic detection of prosodic boundaries in spontaneous speech.
PLoS One. 2021 May 3;16(5):e0250969. doi: 10.1371/journal.pone.0250969. eCollection 2021.
2
How listeners weight acoustic cues to intonational phrase boundaries.
PLoS One. 2014 Jul 14;9(7):e102166. doi: 10.1371/journal.pone.0102166. eCollection 2014.
3
Neural correlates of prosodic boundary perception in German preschoolers: If pause is present, pitch can go.
Brain Res. 2016 Feb 1;1632:27-33. doi: 10.1016/j.brainres.2015.12.009. Epub 2015 Dec 10.
4
Infants' Processing of Prosodic Cues: Electrophysiological Evidence for Boundary Perception beyond Pause Detection.
Lang Speech. 2018 Mar;61(1):153-169. doi: 10.1177/0023830917730590. Epub 2017 Sep 22.
5
Prosodic boundaries in alaryngeal speech.
Clin Linguist Phon. 2008 Mar;22(3):215-31. doi: 10.1080/02699200701847160.
6
Perception of prosodic hierarchical boundaries in Mandarin Chinese sentences.
Neuroscience. 2009 Feb 18;158(4):1416-25. doi: 10.1016/j.neuroscience.2008.10.065. Epub 2008 Nov 24.
7
Pauses and intonational phrasing: ERP studies in 5-month-old German infants and adults.
J Cogn Neurosci. 2009 Oct;21(10):1988-2006. doi: 10.1162/jocn.2009.21221.
8
Experience with a second language affects the use of fundamental frequency in speech segmentation.
PLoS One. 2017 Jul 24;12(7):e0181709. doi: 10.1371/journal.pone.0181709. eCollection 2017.
9
Processing prosodic boundaries in natural and hummed speech: an FMRI study.
Cereb Cortex. 2008 Mar;18(3):541-52. doi: 10.1093/cercor/bhm083. Epub 2007 Jun 24.
10
Identification of vowel length, word stress, and compound words and phrases by postlingually deafened cochlear implant listeners.
J Am Acad Audiol. 2013 Oct;24(9):879-90. doi: 10.3766/jaaa.24.9.11.

Cited by

1
A universal of speech timing: Intonation units form low-frequency rhythms.
Proc Natl Acad Sci U S A. 2025 Aug 26;122(34):e2425166122. doi: 10.1073/pnas.2425166122. Epub 2025 Aug 19.
2
Structure in conversation: Evidence for the vocabulary, semantics, and syntax of prosody.
Proc Natl Acad Sci U S A. 2025 Apr 29;122(17):e2403262122. doi: 10.1073/pnas.2403262122. Epub 2025 Apr 21.
3
Voice Synthesis Improvement by Machine Learning of Natural Prosody.
Sensors (Basel). 2024 Mar 1;24(5):1624. doi: 10.3390/s24051624.
