Tagged-MRI Sequence to Audio Synthesis via Self Residual Attention Guided Heterogeneous Translator

Authors

Liu Xiaofeng, Xing Fangxu, Prince Jerry L, Zhuo Jiachen, Stone Maureen, Fakhri Georges El, Woo Jonghye

Affiliations

Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.

Johns Hopkins University, Baltimore, MD, USA.

Publication information

Med Image Comput Comput Assist Interv. 2022 Sep;13436:376-386. doi: 10.1007/978-3-031-16446-0_36. Epub 2022 Sep 17.

DOI:10.1007/978-3-031-16446-0_36
PMID:36820764
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9942274/
Abstract

Understanding the relationship between the tongue and oropharyngeal muscle deformation seen in tagged-MRI and intelligible speech plays an important role in advancing speech motor control theories and the treatment of speech-related disorders. Because of their heterogeneous representations, however, direct mapping between the two modalities, i.e., a two-dimensional (mid-sagittal slice) plus time tagged-MRI sequence and its corresponding one-dimensional waveform, is not straightforward. Instead, we resort to two-dimensional spectrograms, which contain both pitch and resonance, as an intermediate representation, and from them develop an end-to-end deep learning framework that translates a sequence of tagged-MRI into its corresponding audio waveform with a limited dataset size. Our framework is based on a novel fully convolutional asymmetry translator guided by a self residual attention strategy that specifically exploits the moving muscular structures during speech. In addition, we leverage the pairwise correlation of samples with the same utterances through a latent space representation disentanglement strategy. Furthermore, we incorporate adversarial training with generative adversarial networks to improve the realism of the generated spectrograms. Our experiments, carried out on a total of 63 tagged-MRI sequences with paired speech acoustics, showed that the framework generated clear audio waveforms from a sequence of tagged-MRI and surpassed competing methods. Our framework thus offers great potential for better understanding the relationship between the two modalities.
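To make the idea of a spectrogram as a two-dimensional intermediate representation of a one-dimensional waveform concrete, here is a minimal sketch in plain Python. It is not the authors' method or code: it only shows the waveform-to-spectrogram direction, using a naive short-time Fourier transform. The function names, the frame length, the hop size, and the synthetic test tone are all assumptions chosen for illustration.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of time frames; each frame holds the DFT magnitudes
    for frequency bins 0..frame_len//2 (naive O(n^2) DFT, illustration only)."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Window the current frame to reduce spectral leakage.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a pure 128 Hz tone sampled at 1024 Hz
# (synthetic, not data from the paper).
sample_rate = 1024
tone_hz = 128
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = magnitude_spectrogram(signal)
# Index of the strongest frequency bin in every time frame.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
```

Each bin spans `sample_rate / frame_len` = 16 Hz here, so the tone's energy concentrates in bin 8 of every frame. The resulting time-by-frequency grid is image-like, which is one reason a spectrogram is a more convenient target for a convolutional translator than the raw one-dimensional waveform.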
