Section of Otolaryngology-Head and Neck Surgery, Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada.
Department of Data Science and Analytics, Faculty of Science, University of Calgary, Calgary, Alberta, Canada.
Laryngoscope. 2023 Aug;133(8):1952-1960. doi: 10.1002/lary.30432. Epub 2022 Oct 13.
Diagnostic tools for voice disorders are lacking for primary care physicians. Artificial intelligence (AI) tools may add to the armamentarium for physicians, decreasing the time to diagnosis and limiting the burden of dysphonia.
Voice recordings of patients were collected from 2019 to 2021 using smartphones. The Saarbruecken dataset was included for comparison. Audio files were converted to mel-spectrograms using TensorFlow. Diagnostic categories were created to group pathology, including neurological and muscular disorders, inflammatory disorders, mass lesions, and normal. The samples were further separated into sustained /a/ and the Rainbow Passage.
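The mel-spectrogram step relies on the mel scale, which spaces frequency bands more densely at low frequencies, where most voice energy lies. The sketch below illustrates that spacing only. It is not the authors' TensorFlow pipeline: it uses plain Python with the common HTK-style formula, and the function names, band count, and frequency range are all assumptions chosen for the example.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_centers(n_bands, f_min_hz, f_max_hz):
    """Return n_bands center frequencies (Hz) equally spaced on the mel scale.

    The range endpoints are excluded, so the centers lie strictly
    between f_min_hz and f_max_hz.
    """
    m_min, m_max = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    # Hypothetical settings: 8 bands between 0 Hz and 8 kHz.
    for center in mel_band_centers(8, 0.0, 8000.0):
        print(round(center, 1))
```

Because the centers are equally spaced in mels, their spacing in Hz widens toward higher frequencies. A full mel-spectrogram would apply triangular filters at these centers to each frame of a short-time Fourier transform.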
Two hundred three prospective samples and 1131 samples from the Saarbruecken database were used. The AI detected abnormal pathology with an F1-score of 98%. The artificial neural network (ANN) differentiated key pathologies, including unilateral paralysis, laryngitis, adductor spasmodic dysphonia (ADSD), mass lesions, and normal samples, with F1-scores of 39%-87%. The Calgary database models had higher F1-scores in a head-to-head comparison with the Saarbruecken and combined datasets (87% vs. 58% and 50%). The AI outperformed otolaryngologists on a standardized test set of recordings (83% compared to 55% ± 15%).
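The F1-score reported above is the harmonic mean of precision and recall, computed separately for each diagnostic class. The sketch below shows how per-class F1 could be derived from true and predicted labels. The function name and the toy labels are hypothetical and are not drawn from the study data.

```python
def f1_per_class(y_true, y_pred):
    """Return a dict mapping each class label to its F1-score.

    F1 = 2 * precision * recall / (precision + recall), defined as 0.0
    when a class has no true positives.
    """
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        if tp == 0:
            scores[label] = 0.0
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        scores[label] = 2 * precision * recall / (precision + recall)
    return scores


if __name__ == "__main__":
    # Hypothetical labels for illustration only.
    truth = ["normal", "normal", "laryngitis", "laryngitis", "mass lesion"]
    preds = ["normal", "laryngitis", "laryngitis", "laryngitis", "mass lesion"]
    for label, score in f1_per_class(truth, preds).items():
        print(label, round(score, 2))
```

Reporting F1 per class rather than overall accuracy matters here because the diagnostic categories are unevenly represented, so a model could score well on accuracy while missing a rare pathology.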
An AI tool was created to differentiate pathology by individual or categorical diagnosis with high evaluation metrics. Prospective data should be collected in a controlled fashion to reduce intrinsic variability between recordings. Multi-center data collaborations are imperative to increase the prediction capability of AI tools for detecting vocal cord pathology. We provide proof-of-concept for an AI tool to assist primary care physicians in managing dysphonic patients.
Level of Evidence: 3. Laryngoscope, 133:1952-1960, 2023.