Park Keon Vin, Oh Kyoung Ho, Jeong Yong Jun, Rhee Jihye, Han Mun Soo, Han Sung Won, Choi June
School of Industrial Management Engineering, Korea University, Seoul, Korea.
Department of Otorhinolaryngology-Head and Neck Surgery, Korea University Ansan Hospital, Korea University College of Medicine, Ansan, Korea.
Clin Exp Otorhinolaryngol. 2020 May;13(2):148-156. doi: 10.21053/ceo.2019.01858. Epub 2020 Mar 12.
Predicting the prognosis of idiopathic sudden sensorineural hearing loss (ISSNHL) is an important challenge. In this study, the dataset was split into training and test sets, and cross-validation on the training set was used to determine hyperparameters for machine learning models with high test accuracy and low bias. Five machine learning models were assessed for predicting the hearing prognosis of patients with ISSNHL after 1 month of treatment: adaptive boosting, K-nearest neighbor, multilayer perceptron, random forest (RF), and support vector machine (SVM).
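The workflow described above (hold-out split, cross-validated hyperparameter search on the training set, then evaluation on the test set) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the synthetic data, the 70/30 split, the 5-fold setting, and the hyperparameter grids are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with the same number of rows as the study (NOT the study data).
X, y = make_classification(n_samples=227, n_features=10, n_informative=4,
                           random_state=0)

# Hold out a test set; the 70/30 ratio is an assumption for illustration.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# The five model families named in the abstract, each with a small
# hypothetical hyperparameter grid. Distance- and gradient-based models
# are wrapped in a pipeline so scaling is fitted inside each CV fold.
candidates = {
    "adaboost": (AdaBoostClassifier(random_state=0),
                 {"n_estimators": [50, 100]}),
    "knn": (make_pipeline(StandardScaler(), KNeighborsClassifier()),
            {"kneighborsclassifier__n_neighbors": [3, 5, 7]}),
    "mlp": (make_pipeline(StandardScaler(),
                          MLPClassifier(max_iter=2000, random_state=0)),
            {"mlpclassifier__hidden_layer_sizes": [(8,), (16,)]}),
    "rf": (RandomForestClassifier(random_state=0),
           {"n_estimators": [100, 200]}),
    "svm": (make_pipeline(StandardScaler(), SVC()),
            {"svc__C": [0.1, 1, 10]}),
}

test_accuracy = {}
for name, (estimator, grid) in candidates.items():
    # Cross-validation sees only the training set; the test set is used
    # once, after the best hyperparameters have been refit on all training data.
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    test_accuracy[name] = search.score(X_test, y_test)

for name, acc in sorted(test_accuracy.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.4f}")
```

Keeping the test set out of the hyperparameter search is what allows the final accuracy to serve as an estimate of performance on unseen patients.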
The medical records of 523 patients with ISSNHL admitted to Korea University Ansan Hospital between January 2010 and October 2017 were retrospectively reviewed. After excluding patients with missing data, 227 patients (recovery, 106; no recovery, 121) were included in the analysis. To identify risk factors, statistical hypothesis tests (e.g., the two-sample t-test for continuous variables and the chi-square test for categorical variables) were conducted to compare patients who recovered with those who did not. Variables were selected using an RF model based on two criteria: the mean decrease in the Gini index and the mean decrease in accuracy.
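To make the Gini criterion concrete, the sketch below computes the decrease in Gini impurity produced by a single split. The parent class counts (106 recovered, 121 not recovered) come from the abstract; the child counts are hypothetical and chosen only so that they add up to the parent. An RF importance score of this kind averages such decreases over every node and tree in which a variable is used, while the accuracy criterion is commonly estimated by permuting a variable and measuring the drop in accuracy.

```python
def gini(counts):
    """Gini impurity of a node given its per-class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = sum(parent)
    weighted = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    return gini(parent) - weighted


# (recovered, not recovered)
parent = (106, 121)   # class counts reported in the abstract
left = (70, 40)       # hypothetical child node
right = (36, 81)      # hypothetical child node

decrease = gini_decrease(parent, left, right)
print(f"parent impurity: {gini(parent):.4f}")
print(f"decrease from this split: {decrease:.4f}")
```

A variable whose splits separate recovered from non-recovered patients more cleanly yields larger decreases and therefore ranks higher.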
The SVM model using selected predictors achieved both the highest accuracy (75.36%) and the highest F-score (0.74) on the test set. The RF model with selected variables demonstrated the second-highest accuracy (73.91%) and F-score (0.74). The RF model with the original variables showed the same accuracy (73.91%) as that of the RF model with selected variables, but a lower F-score (0.73). All the tested models, except RF, demonstrated better performance after variable selection based on RF.
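For reference, accuracy and F-score can be derived from a binary confusion matrix as sketched below. The counts are hypothetical and were chosen only so that the results round to the figures quoted for the SVM model; the abstract does not report the confusion matrix, the test-set size, or whether the F-score was computed for one class or averaged across classes. The function name `binary_metrics` is likewise an illustrative choice.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, NOT taken from the study.
metrics = binary_metrics(tp=24, fp=8, fn=9, tn=28)
print(f"accuracy: {metrics['accuracy']:.2%}")
print(f"F1: {metrics['f1']:.2f}")
```

Because the F-score combines precision and recall, two models with identical accuracy, such as the two RF variants above, can still differ in F-score.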
The SVM model with selected predictors was the best-performing of the tested prediction models. The RF model with selected predictors was the second-best model. Therefore, machine learning models can be used to predict hearing recovery in patients with ISSNHL.