Nam Kihwan, Lee Changyeol, Lee Taeheon, Shin Munseop, Kim Bo Hae, Park Jin-Woo
Graduate School of Management of Technology, Korea University, Seoul 02841, Republic of Korea.
AimedAI, Seoul 06150, Republic of Korea.
Diagnostics (Basel). 2024 Jul 6;14(13):1444. doi: 10.3390/diagnostics14131444.
We aimed to develop an automated detector of laryngeal invasion during swallowing. Laryngeal invasion, which causes significant clinical problems, is defined as a score of 2 or higher on the penetration-aspiration scale (PAS). To detect laryngeal invasion (PAS ≥ 2) in videofluoroscopic swallowing study (VFSS) videos, we applied two three-dimensional (3D) stream networks for action recognition. To assess the robustness of our model, we compared its performance with that of several current image classification-based architectures. The proposed model achieved an accuracy of 92.10%. Precision, recall, and F1 score for detecting laryngeal invasion (PAS ≥ 2) in VFSS videos were each 0.9470. The accuracy of our model in identifying laryngeal invasion surpassed that of recent image classification models (60.58% for ResNet101, 60.19% for Swin-Transformer, 63.33% for EfficientNet-B2, and 31.17% for HRNet-W32). Our model is the first automated detector of laryngeal invasion in VFSS videos based on video action recognition networks. Given its high and balanced performance, it may serve as an effective screening tool before clinicians review VFSS videos, ultimately reducing their workload.
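The abstract reports accuracy, precision, recall, and F1 score for a binary detector (laryngeal invasion present or absent). As a minimal sketch of how these four metrics relate, the Python snippet below computes them from confusion-matrix counts. The function name `binary_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not the study's data. The example also shows that when precision and recall are equal, F1 takes the same value, which is consistent with the three identical scores reported.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard metrics for a binary detector from confusion-matrix counts.

    tp: invasion clips correctly flagged      fp: normal clips wrongly flagged
    fn: invasion clips missed                 tn: normal clips correctly passed
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (NOT taken from the paper).
metrics = binary_metrics(tp=80, fp=10, fn=10, tn=100)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With equal numbers of false positives and false negatives, as in these made-up counts, precision and recall coincide and F1 equals both, while accuracy can still differ because it also depends on the true negatives.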