Automatic measurement of vowel duration via structured prediction.

Author information

Adi Yossi, Keshet Joseph, Cibelli Emily, Gustafson Erin, Clopper Cynthia, Goldrick Matthew

Affiliations

Department of Computer Science, Bar-Ilan University, Ramat-Gan, 52900, Israel.

Department of Linguistics, Northwestern University, Evanston, Illinois 60208, USA.

Publication information

J Acoust Soc Am. 2016 Dec;140(6):4517. doi: 10.1121/1.4972527.

DOI: 10.1121/1.4972527
PMID: 28040034
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5392101/
Abstract

A key barrier to making phonetic studies scalable and replicable is the need to rely on subjective, manual annotation. To help meet this challenge, a machine learning algorithm was developed for automatic measurement of a widely used phonetic measure: vowel duration. Manually annotated data were used to train a model that takes as input an arbitrary-length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants, and outputs the duration of the vowel. The model is based on the structured prediction framework. The input signal and a hypothesized pair of vowel onset and offset are mapped to an abstract vector space by a set of acoustic feature functions. The learning algorithm is trained in this space to minimize the difference in expectations between predicted and manually measured vowel durations. The trained model can then automatically estimate vowel durations without phonetic or orthographic transcription. Results comparing the model to three sets of manually annotated data suggest it outperformed the current gold standard for duration measurement, a hidden Markov model-based forced aligner (which requires orthographic or phonetic transcription as an input).
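
The inference step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two feature functions, the per-frame energy input, the weight vector `w`, and the exhaustive search are all assumptions chosen for illustration. The sketch only shows the general shape of structured prediction here, where every candidate (onset, offset) pair is scored by a linear function of its feature vector and the highest-scoring pair is returned.

```python
import numpy as np

def feature_map(energy, onset, offset):
    """Hypothetical feature functions phi(x, onset, offset).

    Assumption: two toy features that compare mean energy inside the
    hypothesized vowel with the mean energy before and after it.
    """
    inside = energy[onset:offset].mean()
    before = energy[:onset].mean()
    after = energy[offset:].mean()
    return np.array([inside - before, inside - after])

def predict(energy, w, min_len=2):
    """Return the (onset, offset) frame pair maximizing w . phi(x, onset, offset)."""
    n = len(energy)
    best, best_score = None, -np.inf
    # Exhaustive search over all pairs that leave at least one frame
    # of context on each side and at least min_len frames inside.
    for onset in range(1, n - min_len):
        for offset in range(onset + min_len, n):
            score = w @ feature_map(energy, onset, offset)
            if score > best_score:
                best, best_score = (onset, offset), score
    return best

# Toy consonant-vowel-consonant signal: low energy, high energy, low energy.
energy = np.array([0.1] * 5 + [1.0] * 10 + [0.1] * 5)
w = np.array([1.0, 1.0])  # assumed weights; in practice these would be learned

onset, offset = predict(energy, w)
duration_frames = offset - onset
```

On this toy signal the search recovers the high-energy region (frames 5 to 15), so `duration_frames` is 10. In a trained system the weights would be fit to manually annotated onsets and offsets rather than set by hand, and the features would be far richer than mean energy.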

