Institute of Electronics, Lodz University of Technology, 90-924 Lodz, Poland.
Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-001 Lodz, Poland.
Sensors (Basel). 2022 Feb 23;22(5):1751. doi: 10.3390/s22051751.
Laryngeal high-speed videoendoscopy (LHSV) is an imaging technique that offers a new quality of visualization of the vibratory activity of the vocal folds. However, most image analysis methods require interaction by medical personnel and access to ground-truth annotations to detect the vocal fold edges accurately. In our fully automatic method, we combine video and acoustic data that are recorded synchronously during the laryngeal endoscopy. We show that the glottal area segmentation algorithm can be optimized by matching the Fourier spectrum of the pre-processed video to the spectrum of the acoustic recording made during phonation of the sustained vowel /i:/. We verify our method on a set of LHSV recordings taken from subjects with normophonic voices and from patients with voice disorders due to glottal insufficiency. We show that the computed geometric indices of the glottal area make it possible to discriminate between normal and pathologic voices. The median Open Quotient and Minimal Relative Glottal Area values were 0.69 and 0.06, respectively, for healthy subjects, and 1 and 0.35, respectively, for dysphonic subjects. These results were also validated by independent expert phoniatricians.
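The two ideas in the abstract, comparing the spectrum of a video-derived signal with the acoustic spectrum and computing geometric indices from the glottal area, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of the strongest spectral peak as the quantity to match, and the index definitions (Open Quotient as the fraction of frames with a nonzero glottal area, Minimal Relative Glottal Area as minimum over maximum area) are all assumptions made for illustration, and the input data are synthetic.

```python
import numpy as np


def dominant_frequency(signal, sample_rate):
    """Frequency (Hz) of the strongest non-DC peak in the magnitude spectrum.

    Assumption: the strongest peak is the fundamental, which may not hold
    for real voice recordings with strong formant-boosted harmonics.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC component
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])


def spectral_mismatch(area, frame_rate, audio, audio_rate):
    """Hypothetical cost: distance between the dominant frequencies of the
    glottal area waveform and the acoustic signal. A segmentation parameter
    could be chosen to minimise a cost of this kind."""
    return abs(dominant_frequency(area, frame_rate)
               - dominant_frequency(audio, audio_rate))


def open_quotient(area, closed_level=0.0):
    """Assumed definition: fraction of frames in which the glottis is open."""
    area = np.asarray(area, dtype=float)
    return float(np.mean(area > closed_level))


def minimal_relative_glottal_area(area):
    """Assumed definition: minimum glottal area divided by maximum area."""
    area = np.asarray(area, dtype=float)
    if area.max() <= 0:
        raise ValueError("glottal area is zero in every frame")
    return float(area.min() / area.max())


# Synthetic example (not data from the study): a 200 Hz vibration seen at
# 4000 frames per second, and a 200 Hz tone sampled at 44.1 kHz.
frame_rate, audio_rate, f0 = 4000.0, 44100.0, 200.0
t_video = np.arange(400) / frame_rate
t_audio = np.arange(4410) / audio_rate
area = np.clip(np.sin(2 * np.pi * f0 * t_video) + 0.4, 0.0, None)
audio = np.sin(2 * np.pi * f0 * t_audio)

mismatch = spectral_mismatch(area, frame_rate, audio, audio_rate)
oq = open_quotient(area)
mrga = minimal_relative_glottal_area(area)
```

In this sketch a fully closing glottis gives a Minimal Relative Glottal Area of 0, while a glottis that never closes gives an Open Quotient of 1, which is the direction of the difference reported between the healthy and dysphonic groups.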