Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos.

Authors

Döllinger Michael, Schraut Tobias, Henrich Lea A, Chhetri Dinesh, Echternach Matthias, Johnson Aaron M, Kunduk Melda, Maryn Youri, Patel Rita R, Samlan Robin, Semmler Marion, Schützenberger Anne

Affiliations

Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany.

Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, CA 90095, USA.

Publication information

Appl Sci (Basel). 2022 Oct;12(19). doi: 10.3390/app12199791. Epub 2022 Sep 28.

DOI:10.3390/app12199791
PMID:37583544
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10427138/
Abstract

Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To account for the resulting "concept shifts" in neural network (NN)-based image processing, NNs that are already trained and in use must be re-trained so that image processing remains sufficiently accurate for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNNs) used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and different spatial resolutions. Results showed that increasing data diversity by means of preprocessing alone already improves the segmentation accuracy (mIoU +6.35%). Subsequent re-training further increases segmentation performance (mIoU +2.81%). For re-training, finetuning with dynamic knowledge distillation showed the most promising results. Data variety for training and additional re-training are helpful tools to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data while forgetting already learned knowledge.
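The abstract reports its gains as mean intersection over union (mIoU). As a rough illustration of what that metric measures, the sketch below computes the IoU between a predicted and a reference binary mask and averages it over images. The function names, the nested-list mask representation, and the convention of scoring two empty masks as 1.0 are assumptions made for this example and are not taken from the paper.

```python
def iou(pred, target):
    """Intersection over union of two binary masks given as nested lists."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1  # pixel is glottis in both masks
            if p or t:
                union += 1  # pixel is glottis in at least one mask
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds, targets):
    """Average the per-image IoU over a collection of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Two tiny 2x2 "images": the first prediction over-segments by one pixel,
# the second matches its reference exactly.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
print(mean_iou(preds, targets))  # (0.5 + 1.0) / 2 = 0.75
```

A real evaluation would more likely operate on array masks with a numerical library; plain lists are used here only to keep the example self-contained.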

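The abstract names finetuning with knowledge distillation as the most promising re-training approach and warns about catastrophic forgetting. The general idea behind distillation during re-training can be sketched as a loss with two terms: one pulls the re-trained ("student") network toward the ground-truth masks of the new data, the other pulls it toward the outputs of the previously trained ("teacher") network so that earlier behaviour is not simply overwritten. The code below is a minimal, generic sketch of that idea for per-pixel foreground probabilities. The fixed weight `alpha`, the binary cross-entropy terms, and all function names are assumptions made for illustration. The abstract does not specify the authors' loss or what makes their variant "dynamic", so this should not be read as their method.

```python
import math


def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for one pixel; target may be a hard or soft label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def distillation_loss(student, teacher, labels, alpha=0.5):
    """Weighted sum of a ground-truth term and a teacher-matching term.

    student, teacher: per-pixel foreground probabilities (flat lists).
    labels: per-pixel ground-truth values (0 or 1) for the new data.
    alpha: hypothetical fixed weight of the teacher term (0 = plain finetuning).
    """
    n = len(student)
    hard = sum(bce(s, y) for s, y in zip(student, labels)) / n
    soft = sum(bce(s, t) for s, t in zip(student, teacher)) / n
    return (1.0 - alpha) * hard + alpha * soft


# The teacher term is smaller when the student stays close to the old model.
labels = [1, 0, 1]
teacher = [0.9, 0.2, 0.8]
close_student = [0.9, 0.2, 0.8]
drifted_student = [0.1, 0.9, 0.2]
print(distillation_loss(close_student, teacher, labels, alpha=1.0)
      < distillation_loss(drifted_student, teacher, labels, alpha=1.0))  # True
```

With `alpha=0` the loss reduces to ordinary finetuning on the new labels, which is the setting in which forgetting is least constrained; larger values trade adaptation to the new data against fidelity to the old model.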
