Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, USA.
Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, Massachusetts, USA.
Otolaryngol Head Neck Surg. 2023 Jul;169(1):41-46. doi: 10.1177/01945998221119156. Epub 2023 Jan 27.
We compared the diagnostic performance of human clinicians with that of a neural network algorithm developed from a library of tympanic membrane images. The images came from children brought to the operating room for planned myringotomy and possible tube placement for recurrent acute otitis media (AOM) or otitis media with effusion (OME).
Retrospective cohort study.
Tertiary academic medical center from 2018 to 2021.
A training set of 639 images of tympanic membranes representing normal, OME, and AOM was used to train a neural network as well as a proprietary commercial image classifier from Google. Model diagnostic prediction performance in differentiating normal vs nonpurulent vs purulent effusion was scored based on classification accuracy. A web-based survey was developed to test human clinicians' diagnostic accuracy on a novel image set, and this was compared head to head against our model.
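Scoring by classification accuracy, as described above, amounts to counting how often the predicted class matches the reference class. The following is a minimal sketch of that calculation, not the study's code: the class names, function names, and example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical label names for the three classes described in the abstract.
CLASSES = ("normal", "nonpurulent", "purulent")


def classification_accuracy(true_labels, predicted_labels):
    """Return the fraction of images whose predicted class equals the reference class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one labeled image is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count (reference, predicted) pairs to show which classes get confused."""
    return Counter(zip(true_labels, predicted_labels))


if __name__ == "__main__":
    # Made-up example labels, not data from the study.
    reference = ["normal", "nonpurulent", "purulent", "purulent"]
    predicted = ["normal", "purulent", "purulent", "purulent"]
    print(classification_accuracy(reference, predicted))
    print(confusion_counts(reference, predicted))
```

The same function can score a model's predictions and a clinician's survey answers against one reference standard, which is what a head-to-head comparison on a shared image set requires.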
Our model achieved a mean prediction accuracy of 80.8% (95% CI, 77.0%-84.6%). The Google model achieved a prediction accuracy of 85.4%. In a validation survey of 39 clinicians analyzing a sample of 22 endoscopic ear images, the average diagnostic accuracy was 65.0%. On the same data set, our model achieved an accuracy of 95.5%.
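As a quick arithmetic check, an accuracy of 95.5% on 22 images corresponds to 21 of 22 images classified correctly. The sketch below recovers that count and, purely as an illustration, computes a Wilson score interval for a proportion from so small a sample. The interval is an assumption-laden illustration and not a result reported in the study; the function names are hypothetical.

```python
import math


def counts_matching(reported_percent, total, decimals=1):
    """List the correct-answer counts that round to the reported percentage."""
    return [
        k for k in range(total + 1)
        if round(100 * k / total, decimals) == reported_percent
    ]


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


if __name__ == "__main__":
    print(counts_matching(95.5, 22))   # which counts out of 22 round to 95.5%
    print(wilson_interval(21, 22))     # illustrative interval only
```

With only 22 images, a single additional error would change the reported accuracy by about 4.5 percentage points, so the point estimate is best read alongside the sample size.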
Our model outperformed certain groups of human clinicians in assessing images of tympanic membranes for effusions in children. Lower diagnostic error rates with machine learning models could reduce misdiagnosis, potentially leading to fewer missed diagnoses, unnecessary antibiotic prescriptions, and surgical procedures.