Alsharid Mohammad, Sharma Harshita, Drukker Lior, Chatelain Pierre, Papageorghiou Aris T, Noble J Alison
University of Oxford, Oxford, UK.
Med Image Comput Comput Assist Interv. 2019 Oct;22:338-346. doi: 10.1007/978-3-030-32251-9_37. Epub 2019 Oct 10.
We describe an automatic natural language processing (NLP)-based image captioning method to describe fetal ultrasound video content by modelling the vocabulary commonly used by sonographers and sonologists. The generated captions are similar to the words spoken by a sonographer when describing the scan experience in terms of visual content and performed scanning actions. Using full-length second-trimester fetal ultrasound videos and text derived from accompanying expert voice-over audio recordings, we train deep learning models consisting of convolutional neural networks and recurrent neural networks in merged configurations to generate captions for ultrasound video frames. We evaluate different model architectures using established general metrics and application-specific metrics. Results show that the proposed models can learn joint representations of image and text to generate relevant and descriptive captions for anatomies, such as the spine, the abdomen, the heart, and the head, in clinical fetal ultrasound scans.
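To make the "merged configuration" concrete, below is a minimal PyTorch sketch of a merge-style CNN-RNN captioner: image features and partial-caption features are encoded by separate branches and combined only before the word classifier. The class name, layer sizes, feature dimension, and vocabulary size are illustrative assumptions, not the paper's reported architecture, and the CNN backbone is assumed to have been applied offline to produce frame features.

```python
import torch
import torch.nn as nn

class MergeCaptioner(nn.Module):
    """Merge-style captioner (hypothetical sketch): the image branch and the
    text branch are encoded independently, then concatenated and passed to a
    classifier that predicts the next caption word."""

    def __init__(self, vocab_size, embed_dim=256, feat_dim=2048, hidden_dim=256):
        super().__init__()
        # Image branch: project pre-extracted CNN frame features
        # (e.g. from a ResNet-style backbone) into the joint space.
        self.img_proj = nn.Linear(feat_dim, hidden_dim)
        # Text branch: embed the caption-so-far and summarise it with an LSTM.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Merge the two modalities and predict next-word logits.
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, vocab_size),
        )

    def forward(self, img_feats, caption_prefix):
        img = torch.relu(self.img_proj(img_feats))   # (B, H)
        emb = self.embed(caption_prefix)             # (B, T, E)
        _, (h, _) = self.lstm(emb)                   # h: (1, B, H)
        txt = h.squeeze(0)                           # (B, H)
        merged = torch.cat([img, txt], dim=1)        # (B, 2H)
        return self.fc(merged)                       # (B, vocab_size)

# Toy usage with random data: 4 frames, caption prefixes of length 7.
model = MergeCaptioner(vocab_size=5000)
feats = torch.randn(4, 2048)
prefix = torch.randint(0, 5000, (4, 7))
logits = model(feats, prefix)
print(logits.shape)  # torch.Size([4, 5000])
```

At inference time such a model is typically unrolled greedily: starting from a start token, the highest-scoring word is appended to the prefix and the model is re-applied until an end token or a length limit is reached.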