4D magnetic resonance imaging atlas construction using temporally aligned audio waveforms in speech.

Affiliations

Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA.

Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA.

Publication information

J Acoust Soc Am. 2021 Nov;150(5):3500. doi: 10.1121/10.0007064.

DOI:10.1121/10.0007064

PMID:34852570

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8580575/

Abstract

Magnetic resonance (MR) imaging is becoming an established tool in capturing articulatory and physiological motion of the structures and muscles throughout the vocal tract and enabling visual and quantitative assessment of real-time speech activities. Although motion capture speed has been regularly improved by the continual developments in high-speed MR technology, quantitative analysis of multi-subject group data remains challenging due to variations in speaking rate and imaging time among different subjects. In this paper, a workflow of post-processing methods that matches different MR image datasets within a study group is proposed. Each subject's recorded audio waveform during speech is used to extract temporal domain information and generate temporal alignment mappings from their matching pattern. The corresponding image data are resampled by deformable registration and interpolation of the deformation fields, achieving inter-subject temporal alignment between image sequences. A four-dimensional dynamic MR speech atlas is constructed using aligned volumes from four human subjects. Similarity tests between subject and target domains using the squared error, cross correlation, and mutual information measures all show an overall score increase after spatiotemporal alignment. The amount of image variability in atlas construction is reduced, indicating a quality increase in the multi-subject data for groupwise quantitative analysis.
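The temporal alignment step described in the abstract can be illustrated with a small sketch. The abstract does not name the matching algorithm or the audio features, so everything here is an assumption made for illustration: log short-time energy stands in for the temporal-domain information, classic dynamic time warping (DTW) stands in for the pattern matching, and the helper names (`short_time_energy`, `dtw_path`, `warp_to_target`) and the synthetic waveforms are hypothetical. The sketch stops at the temporal mapping; resampling the image volumes by deformable registration is not shown.

```python
import numpy as np

def short_time_energy(wave, frame_len, hop):
    """Log short-time energy of a 1-D waveform, one value per analysis frame."""
    n_frames = 1 + max(0, (len(wave) - frame_len) // hop)
    frames = np.stack([wave[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return np.log(np.sum(frames ** 2, axis=1) + 1e-8)

def dtw_path(a, b):
    """Classic DTW between 1-D feature sequences a and b.

    Returns a monotone list of (i, j) index pairs running from (0, 0)
    to (len(a) - 1, len(b) - 1).
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end of both sequences to their start.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: cost[s])
        path.append((i - 1, j - 1))
    return path[::-1]

def warp_to_target(path, n_target):
    """For each target frame, the mean (possibly fractional) subject frame matched to it."""
    matched = [[] for _ in range(n_target)]
    for i, j in path:
        matched[j].append(i)
    return np.array([np.mean(m) for m in matched])

# Synthetic stand-ins for two recordings of the same utterance (assumed data):
# the same two-burst envelope, spoken over 1.0 s and 1.5 s.
rng = np.random.default_rng(0)

def synthetic_utterance(duration_s, sr=8000):
    t = np.linspace(0.0, 1.0, int(duration_s * sr), endpoint=False)
    envelope = np.exp(-((t - 0.3) / 0.08) ** 2) + 0.6 * np.exp(-((t - 0.7) / 0.1) ** 2)
    return envelope * rng.standard_normal(t.size)

target_feat = short_time_energy(synthetic_utterance(1.0), frame_len=400, hop=200)
subject_feat = short_time_energy(synthetic_utterance(1.5), frame_len=400, hop=200)

path = dtw_path(subject_feat, target_feat)
mapping = warp_to_target(path, len(target_feat))
```

Here `mapping[k]` gives the subject time frame that corresponds to target frame `k`. Under these assumptions, fractional values would be the points at which a subject's image sequence needs to be interpolated to match the target timeline.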

Similar articles

Quantifying Velopharyngeal Motion Variation in Speech Sound Production Using an Audio-Informed Dynamic MRI Atlas.

Proc SPIE Int Soc Opt Eng. 2023 Feb;12464. doi: 10.1117/12.2654082. Epub 2023 Apr 3.

A Four-dimensional Motion Field Atlas of the Tongue from Tagged and Cine Magnetic Resonance Imaging.

Proc SPIE Int Soc Opt Eng. 2017;10133. doi: 10.1117/12.2254363. Epub 2017 Feb 24.

Construction of An Unbiased Spatio-Temporal Atlas of the Tongue During Speech.

Inf Process Med Imaging. 2015;24:723-32. doi: 10.1007/978-3-319-19992-4_57.

A novel framework for longitudinal atlas construction with groupwise registration of subject image sequences.

Neuroimage. 2012 Jan 16;59(2):1275-89. doi: 10.1016/j.neuroimage.2011.07.095. Epub 2011 Aug 22.

Spatio-temporal (2D+T) non-rigid registration of real-time 3D echocardiography and cardiovascular MR image sequences.

Phys Med Biol. 2011 Mar 7;56(5):1341-60. doi: 10.1088/0031-9155/56/5/008. Epub 2011 Feb 4.

Hierarchical alignment of breast DCE-MR images by groupwise registration and robust feature matching.

Med Phys. 2012 Jan;39(1):353-66. doi: 10.1118/1.3665705.

Noise reduction and motion elimination in low-dose 4D myocardial computed tomography perfusion (CTP): preliminary clinical evaluation of the ASTRA4D algorithm.

Eur Radiol. 2019 Sep;29(9):4572-4582. doi: 10.1007/s00330-018-5899-8. Epub 2019 Feb 4.

Data-driven respiratory motion compensation for four-dimensional cone-beam computed tomography (4D-CBCT) using groupwise deformable registration.

Med Phys. 2018 Oct;45(10):4471-4482. doi: 10.1002/mp.13133. Epub 2018 Sep 18.

Groupwise registration with global-local graph shrinkage in atlas construction.

Med Image Anal. 2020 Aug;64:101711. doi: 10.1016/j.media.2020.101711. Epub 2020 Jun 10.

Cited by

Quantifying articulatory variations across phonological environments: An atlas-based approach using dynamic magnetic resonance imaging.

J Acoust Soc Am. 2024 Dec 1;156(6):4000-4009. doi: 10.1121/10.0034639.

Sex Differences in Velopharyngeal Anatomy of 9- and 10-Year-Old Children.

J Speech Lang Hear Res. 2023 Dec 11;66(12):4828-4837. doi: 10.1044/2023_JSLHR-23-00279. Epub 2023 Oct 30.

Optimization of 3D dynamic speech MRI: Poisson-disc undersampling and locally higher-rank reconstruction through partial separability model with regional optimized temporal basis.

Magn Reson Med. 2024 Jan;91(1):61-74. doi: 10.1002/mrm.29812. Epub 2023 Sep 7.

Quantifying Velopharyngeal Motion Variation in Speech Sound Production Using an Audio-Informed Dynamic MRI Atlas.

Proc SPIE Int Soc Opt Eng. 2023 Feb;12464. doi: 10.1117/12.2654082. Epub 2023 Apr 3.

Establishing a Clinical Protocol for Velopharyngeal MRI and Interpreting Imaging Findings.

Cleft Palate Craniofac J. 2024 May;61(5):748-758. doi: 10.1177/10556656221141188. Epub 2022 Nov 30.
