An open-source toolbox for measuring vocal tract shape from real-time magnetic resonance images.

Author information

Department of Psychology, Edge Hill University, Ormskirk, UK.

Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK.

Publication information

Behav Res Methods. 2024 Mar;56(3):2623-2635. doi: 10.3758/s13428-023-02171-9. Epub 2023 Jul 28.

Abstract

Real-time magnetic resonance imaging (rtMRI) is a technique that provides high-contrast videographic data of human anatomy in motion. Applied to the vocal tract, it is a powerful method for capturing the dynamics of speech and other vocal behaviours by imaging structures internal to the mouth and throat. These images provide a means of studying the physiological basis for speech, singing, expressions of emotion, and swallowing that are otherwise not accessible for external observation. However, taking quantitative measurements from these images is notoriously difficult. We introduce a signal processing pipeline that produces outlines of the vocal tract from the lips to the larynx as a quantification of the dynamic morphology of the vocal tract. Our approach performs simple tissue classification, but constrained to a researcher-specified region of interest. This combination facilitates feature extraction while retaining the domain-specific expertise of a human analyst. We demonstrate that this pipeline generalises well across datasets covering behaviours such as speech, vocal size exaggeration, laughter, and whistling, as well as producing reliable outcomes across analysts, particularly among users with domain-specific expertise. With this article, we make this pipeline available for immediate use by the research community, and further suggest that it may contribute to the continued development of fully automated methods based on deep learning algorithms.
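
The idea of tissue classification constrained to a researcher-specified region of interest can be illustrated with a minimal sketch. This is not the toolbox's implementation: the function names `classify_air_in_roi` and `outline`, the midpoint threshold, and the synthetic frame are all assumptions made for illustration. The sketch labels dark (air) pixels only inside the region of interest, then keeps the boundary pixels of that labelled region as a crude outline.

```python
import numpy as np


def classify_air_in_roi(frame, roi_mask, threshold=None):
    """Label low-intensity (air-like) pixels, but only inside the ROI."""
    roi_values = frame[roi_mask]
    if threshold is None:
        # Assumed rule: midpoint between the darkest and brightest ROI pixels.
        threshold = 0.5 * (roi_values.min() + roi_values.max())
    air = np.zeros(frame.shape, dtype=bool)
    air[roi_mask] = roi_values < threshold
    return air


def outline(mask):
    """Keep mask pixels that have at least one 4-connected neighbour outside the mask."""
    padded = np.pad(mask, 1, constant_values=False)
    interior = (
        padded[:-2, 1:-1]    # neighbour above
        & padded[2:, 1:-1]   # neighbour below
        & padded[1:-1, :-2]  # neighbour to the left
        & padded[1:-1, 2:]   # neighbour to the right
    )
    return mask & ~interior


# Synthetic frame: bright tissue with a dark channel standing in for the airway.
frame = np.ones((9, 8))
frame[3:6, 1:7] = 0.0   # dark channel, 3 rows by 6 columns
frame[0, 0] = 0.0       # dark pixel outside the ROI, which must be ignored

# Hypothetical analyst-drawn ROI covering the channel and some surrounding tissue.
roi = np.zeros(frame.shape, dtype=bool)
roi[2:7, :] = True

air = classify_air_in_roi(frame, roi)
edge = outline(air)
print(int(air.sum()), int(edge.sum()))  # 18 14
```

Restricting the classification to the mask is what keeps the dark pixel at the image corner from being labelled as airway, which is the role the abstract assigns to the analyst's region of interest.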

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/699f/10990993/1244a046a669/13428_2023_2171_Fig1_HTML.jpg
