Department of Unified Health Record, Lean for Business Services, Riyadh, Saudi Arabia.
Leeds Institute of Health Sciences, School of Medicine, University of Leeds, Leeds, United Kingdom.
J Med Internet Res. 2022 Oct 14;24(10):e38472. doi: 10.2196/38472.
Investigating voice disorders involves a series of processes, including voice screening and diagnosis. Both methods have limited standardized tests, which are affected by the clinician's experience and subjective judgment. Machine learning (ML) algorithms have been used as an objective tool in screening or diagnosing voice disorders. However, the effectiveness of ML algorithms in assessing and diagnosing voice disorders has not received sufficient scholarly attention.
This systematic review aimed to assess the effectiveness of ML algorithms in screening and diagnosing voice disorders.
An electronic search was conducted in 5 databases. Studies that examined the performance (accuracy, sensitivity, and specificity) of any ML algorithm in detecting pathological voice samples were included. Two reviewers independently selected the studies, extracted data from the included studies, and assessed the risk of bias. The methodological quality of each study was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 tool via RevMan 5 software (Cochrane Library). The characteristics of studies, population, and index tests were extracted, and meta-analyses were conducted to pool the accuracy, sensitivity, and specificity of ML techniques. The issue of heterogeneity was addressed by discussing possible sources and excluding studies when necessary.
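As an illustration of the three performance measures named above, the following minimal sketch computes accuracy, sensitivity, and specificity from the confusion-matrix counts of a single classifier. The function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from any included study; the sketch does not reproduce the review's meta-analytic pooling.

```python
import math


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, sensitivity, and specificity for one classifier.

    tp/fn: pathological voice samples classified correctly/incorrectly.
    tn/fp: healthy voice samples classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        # Share of all samples that were classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of pathological samples that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        # Share of healthy samples that were correctly ruled out.
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, 100 pathological and 100 healthy samples are evaluated, so sensitivity and specificity are each computed over 100 samples and accuracy over all 200.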
Of the 1409 records retrieved, 13 studies and 4079 participants were included in this review. A total of 13 ML techniques were used in the included studies, with the most common technique being least squares support vector machine. The pooled accuracy, sensitivity, and specificity of ML techniques in screening voice disorders were 93%, 96%, and 93%, respectively. Least squares support vector machine had the highest accuracy (99%), while the K-nearest neighbor algorithm had the highest sensitivity (98%) and specificity (98%). Quadratic discriminant analysis achieved the lowest accuracy (91%), sensitivity (89%), and specificity (89%).
ML showed promising findings in the screening of voice disorders. However, the findings were not conclusive in diagnosing voice disorders owing to the limited number of studies that used ML for diagnostic purposes; thus, more investigations are needed. While it might not be possible to use ML alone as a substitute for current diagnostic tools, it may be used as a decision support tool for clinicians to assess their patients, which could improve the management process for assessment.
PROSPERO CRD42020214438; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=214438.