Döllinger Michael, Schraut Tobias, Henrich Lea A, Chhetri Dinesh, Echternach Matthias, Johnson Aaron M, Kunduk Melda, Maryn Youri, Patel Rita R, Samlan Robin, Semmler Marion, Schützenberger Anne
Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany.
Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, CA 90095, USA.
Appl Sci (Basel). 2022 Oct;12(19). doi: 10.3390/app12199791. Epub 2022 Sep 28.
Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To account for the resulting "concept shifts" in neural network (NN)-based image processing, NNs that are already trained and in use must be re-trained so that image processing remains sufficiently accurate for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNNs) used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and spatial resolutions. Results showed that increasing data diversity by means of preprocessing alone already improves segmentation accuracy (mIoU +6.35%). Subsequent re-training further increases segmentation performance (mIoU +2.81%). For re-training, fine-tuning with dynamic knowledge distillation showed the most promising results. Data variety in training, together with additional re-training, is a helpful tool to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data at the cost of forgetting already learned knowledge.
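The segmentation results above are reported as mean intersection over union (mIoU). As a rough illustration of that metric, the sketch below computes IoU for binary masks and averages it over images. It assumes per-image averaging and treats two empty masks as a perfect match; the abstract does not state which averaging convention the authors used, and the names `iou` and `mean_iou` are hypothetical.

```python
from typing import Sequence

# Binary mask as nested sequences: 1 = segmented (e.g. glottis) pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of identical shape."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: if both masks are empty, count the prediction as perfect.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average the per-image IoU over a set of images (assumed convention)."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

For example, a prediction that marks two pixels where the reference marks only one of them has an intersection of 1 and a union of 2, so its IoU is 0.5.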
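Knowledge distillation is one common way to limit catastrophic forgetting during fine-tuning: the re-trained (student) network is penalized both for deviating from the ground-truth labels and for deviating from the outputs of the original (teacher) network. The sketch below shows a generic loss of that form for per-pixel foreground probabilities. It is not the authors' formulation: the weighting `alpha`, the use of binary cross-entropy, and the function names are assumptions, and the abstract does not specify how the "dynamic" variant adjusts the weighting.

```python
import math
from typing import Sequence


def bce(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between predicted probabilities and targets in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def distillation_loss(
    student: Sequence[float],
    teacher: Sequence[float],
    labels: Sequence[float],
    alpha: float,
) -> float:
    """Blend a supervised term (new data) with a distillation term (old model).

    alpha = 0 is plain fine-tuning on the labels; alpha = 1 only imitates the
    teacher. A "dynamic" scheme would change alpha during training; how it is
    changed is an assumption left to the caller here.
    """
    supervised = bce(student, labels)
    distilled = bce(student, teacher)
    return (1.0 - alpha) * supervised + alpha * distilled
```

Passing `alpha` as an argument keeps the sketch neutral about the schedule: the caller can hold it fixed or vary it per epoch or per batch.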