Mo Jonathan T, Chong Davis S, Sun Cynthia, Mohapatra Nikita, Jiam Nicole T
University of California, Davis School of Medicine, Sacramento, California, USA.
Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, California, USA.
Ear Hear. 2025;46(4):952-962. doi: 10.1097/AUD.0000000000001638. Epub 2025 Jan 29.
Cochlear implant (CI) user functional outcomes are challenging to predict because of the variability in individual anatomy, neural health, CI device characteristics, and linguistic and listening experience. Machine learning (ML) techniques are uniquely poised for this predictive challenge because they can analyze nonlinear interactions using large amounts of multidimensional data. The objective of this article is to systematically review the literature regarding ML models that predict functional CI outcomes, defined as sound perception and production. We analyze the potential strengths and weaknesses of various ML models, identify important features for favorable outcomes, and suggest potential future directions of ML applications for CI-related clinical and research purposes.
We conducted a systematic literature search of Web of Science, Scopus, MEDLINE, EMBASE, CENTRAL, and CINAHL from the date of inception through September 2024. We included studies with ML models predicting a CI functional outcome, defined as an outcome pertaining to sound perception and production, and excluded simulation studies and those involving patients without CIs. Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, we extracted participant population, CI characteristics, ML model, and performance data. Sixteen studies examining 5058 pediatric and adult CI users (range: 4 to 2489 per study) were included from an initial 1442 publications.
Studies predicted heterogeneous outcome measures pertaining to sound production (5 studies), sound perception (12 studies), and language (2 studies). The ML models used a variety of prediction features, including demographic, audiological, imaging, and subjective measures. Some studies highlighted predictors beyond traditional CI audiometric outcomes, such as anatomical and imaging characteristics (e.g., vestibulocochlear nerve area, brain regions unaffected by auditory deprivation), health system factors (e.g., wait time to referral), and patient-reported measures (e.g., dizziness and tinnitus questionnaires). The ML models used were tree-based, kernel-based, instance-based, probabilistic, or neural networks, and the most common validation and test methods were k-fold cross-validation and train-test split. Various statistical measures were used to evaluate model performance; among studies reporting accuracy, the best-performing model in each study achieved between 71.0% and 98.83%.
ML models demonstrate high predictive performance and illuminate factors that contribute to CI user functional outcomes. While many models showed favorable evaluation statistics, the majority were not adequately reported with regard to dataset characteristics, model creation, and validation. Furthermore, the extent of overfitting in these models is unclear, and any overfitting would likely result in poor generalization to new data. These findings suggest the need for more robust validation procedures and standardized reporting, with the ultimate hope that iterative improvement of these models will allow for their adoption as a future clinical tool.
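The overfitting concern raised above can be illustrated with a minimal sketch of k-fold cross-validation, one of the validation methods the reviewed studies most commonly used. The sketch is not taken from any reviewed study: the data are synthetic noise with random labels, the classifier is a 1-nearest-neighbor model chosen only as a simple instance-based example, and all function names (`k_fold_indices`, `predict_1nn`, `accuracy`, `cross_val_accuracy`) are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def predict_1nn(train_X, train_y, x):
    """Return the label of the closest training point (squared Euclidean)."""
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[nearest]


def accuracy(train_X, train_y, test_X, test_y):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        predict_1nn(train_X, train_y, x) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean held-out accuracy across k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        scores.append(
            accuracy(
                [X[i] for i in train_idx],
                [y[i] for i in train_idx],
                [X[i] for i in held_out],
                [y[i] for i in held_out],
            )
        )
    return sum(scores) / len(scores)


# Synthetic data (an assumption for illustration): 100 samples, 5 noise
# features, random binary labels, so there is no real signal to learn.
rng = random.Random(42)
X = [[rng.random() for _ in range(5)] for _ in range(100)]
y = [rng.randint(0, 1) for _ in range(100)]

# Evaluating on the training data itself: each point is its own nearest
# neighbor, so the model looks perfect.
train_acc = accuracy(X, y, X, y)

# Evaluating on held-out folds gives a more honest estimate.
cv_acc = cross_val_accuracy(X, y, k=5)

print(f"accuracy on training data: {train_acc:.2f}")
print(f"5-fold cross-validated accuracy: {cv_acc:.2f}")
```

Because the labels are random, a model that memorizes the training set scores perfectly when tested on that same data, while the cross-validated estimate should fall well below it. The gap between the two figures is the kind of information a reader needs in order to judge how far a reported accuracy can be trusted.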