Zhang Tianxiao, Bur Andrés M, Kraft Shannon, Kavookjian Hannah, Renslo Bryan, Chen Xiangyu, Luo Bo, Wang Guanghui
Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA.
Department of Otolaryngology-Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA.
J Imaging. 2023 May 29;9(6):109. doi: 10.3390/jimaging9060109.
Flexible laryngoscopy is commonly performed by otolaryngologists to detect laryngeal diseases and to recognize potentially malignant lesions. Recently, researchers have introduced machine learning techniques to facilitate automated diagnosis from laryngeal images and have achieved promising results. Diagnostic performance can be improved when patients' demographic information is incorporated into the models. However, manual entry of patient data is time-consuming for clinicians. In this study, we made the first attempt to use deep learning models to predict patient demographic information from laryngoscopic images, with the aim of improving the detector model's performance. The overall accuracy for gender, smoking history, and age prediction was 85.5%, 65.2%, and 75.9%, respectively. We also created a new laryngoscopic image set for the machine learning study and benchmarked the performance of eight classical deep learning models based on CNNs and Transformers. The predicted demographic information can be integrated into current learning models to improve their performance.