Lee Ki-Sun, Lee Eunyoung, Choi Bareun, Pyun Sung-Bom
Medical Science Research Center, Ansan Hospital, Korea University College of Medicine, Ansan-si 15355, Korea.
Department of Physical Medicine and Rehabilitation, Anam Hospital, Korea University College of Medicine, Seoul 02841, Korea.
Diagnostics (Basel). 2021 Feb 13;11(2):300. doi: 10.3390/diagnostics11020300.
Video fluoroscopic swallowing study (VFSS) is considered the gold standard diagnostic tool for evaluating dysphagia. However, manually searching the long recorded video frame by frame to identify instantaneous swallowing abnormalities in VFSS images is time-consuming and labor-intensive for the clinician. This study therefore presents a deep learning-based approach, using transfer learning with a convolutional neural network (CNN), that automatically annotates pharyngeal-phase frames in untrimmed VFSS videos so that frames need not be searched manually.
To determine whether an image frame in a VFSS video belongs to the pharyngeal phase, a single-frame baseline architecture based on a deep CNN framework is used, and a transfer learning technique with fine-tuning is applied.
Among all the CNN models tested, the model fine-tuned with two blocks of VGG-16 (VGG16-FT5) achieved the highest performance in recognizing pharyngeal-phase frames: an accuracy of 93.20 (±1.25)%, sensitivity of 84.57 (±5.19)%, specificity of 94.36 (±1.21)%, AUC of 0.8947 (±0.0269), and kappa of 0.7093 (±0.0488).
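The frame-level metrics reported above follow from a 2×2 confusion matrix of pharyngeal vs. non-pharyngeal predictions. A minimal sketch of how they are computed (the counts in the usage example are illustrative, not taken from the paper):

```python
# Hedged sketch: accuracy, sensitivity, specificity, and Cohen's kappa
# from confusion-matrix counts. Pharyngeal-phase frames are the
# positive class.
def frame_metrics(tp, fn, fp, tn):
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall on pharyngeal-phase frames
    specificity = tn / (tn + fp)   # recall on non-pharyngeal frames

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_o = accuracy
    p_yes = ((tp + fn) / total) * ((tp + fp) / total)
    p_no = ((fp + tn) / total) * ((fn + tn) / total)
    p_e = p_yes + p_no
    kappa = (p_o - p_e) / (1 - p_e)
    return accuracy, sensitivity, specificity, kappa

# Illustrative counts only: 40 true positives, 10 false negatives,
# 5 false positives, 45 true negatives.
acc, sens, spec, kappa = frame_metrics(40, 10, 5, 45)
```

Kappa is reported alongside accuracy because pharyngeal-phase frames are a minority of an untrimmed video, so chance-corrected agreement is more informative than raw accuracy under class imbalance.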
Using appropriate fine-tuning techniques together with explainable deep learning techniques such as Grad-CAM, this study shows that the proposed single-frame-baseline-architecture-based deep CNN framework can achieve high performance in the full automation of VFSS video analysis.