Redd Travis K, Prajna N Venkatesh, Srinivasan Muthiah, Lalitha Prajna, Krishnan Tiru, Rajaraman Revathi, Venugopal Anitha, Acharya Nisha, Seitzman Gerami D, Lietman Thomas M, Keenan Jeremy D, Campbell J Peter, Song Xubo
Casey Eye Institute, Oregon Health & Science University, Portland, Oregon.
Aravind Eye Hospital, Madurai, Tamil Nadu, India.
Ophthalmol Sci. 2022 Jan 29;2(2):100119. doi: 10.1016/j.xops.2022.100119. eCollection 2022 Jun.
Purpose: To develop computer vision models for image-based differentiation of bacterial and fungal corneal ulcers and to compare their performance against human experts.
Design: Cross-sectional comparison of diagnostic performance.
Participants: Patients with acute, culture-proven bacterial or fungal keratitis from 4 centers in South India.
Methods: Five convolutional neural networks (CNNs) were trained on handheld camera images of culture-proven corneal ulcers from patients in South India recruited as part of clinical trials conducted between 2006 and 2015. Their performance was evaluated on 2 hold-out test sets (1 single-center and 1 multicenter) from South India. Twelve local expert cornea specialists remotely interpreted the images in the multicenter test set to enable direct comparison against CNN performance.
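The abstract does not report the training framework, image size, or hyperparameters. The sketch below shows, in Keras, one common way to set up MobileNet-based transfer learning for a binary bacterial-versus-fungal classifier; the 224 x 224 input size, fine-tuning strategy, and optimizer settings are illustrative assumptions rather than the authors' implementation.

```python
# Illustrative sketch only: MobileNet transfer learning for a binary
# bacterial-vs-fungal ulcer classifier (assumed 224x224 RGB inputs).
import tensorflow as tf

def build_mobilenet_classifier(input_shape=(224, 224, 3)):
    # ImageNet-pretrained MobileNet backbone with global average pooling.
    backbone = tf.keras.applications.MobileNet(
        weights="imagenet", include_top=False,
        input_shape=input_shape, pooling="avg")
    backbone.trainable = True  # fine-tune the whole network (assumption)

    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.applications.mobilenet.preprocess_input(inputs)
    x = backbone(x)
    # Single sigmoid unit: predicted probability of fungal etiology.
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC(name="auc")])
    return model
```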
Main Outcome Measures: Area under the receiver operating characteristic curve (AUC) for each CNN and each human expert individually and for each group collectively (i.e., the CNN ensemble and the human ensemble).
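A minimal sketch of how the individual and ensemble AUCs could be computed, assuming each model's (or grader's) predicted probability of fungal etiology is available per image. Averaging the predicted probabilities is one plausible ensemble rule; the abstract does not state the aggregation actually used.

```python
# Minimal sketch under assumed data layout: per-model and ensemble AUCs.
import numpy as np
from sklearn.metrics import roc_auc_score

def individual_and_ensemble_auc(y_true, prob_matrix):
    """y_true: (n,) labels, 0 = bacterial, 1 = fungal.
    prob_matrix: (n_models, n) predicted probabilities of fungal etiology."""
    individual = [roc_auc_score(y_true, probs) for probs in prob_matrix]
    # Assumed ensemble rule: average predicted probabilities across models.
    ensemble = roc_auc_score(y_true, prob_matrix.mean(axis=0))
    return individual, ensemble
```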
Results: The best-performing CNN architecture was MobileNet, which attained an AUC of 0.86 on the single-center test set (other CNNs range, 0.68-0.84) and 0.83 on the multicenter test set (other CNNs range, 0.75-0.83). Expert human AUCs on the multicenter test set ranged from 0.42 to 0.79. The CNN ensemble achieved a statistically significantly higher AUC (0.84) than the human ensemble (0.76; P < 0.01). CNNs showed relatively higher accuracy for fungal (81%) versus bacterial (75%) ulcers, whereas humans showed relatively higher accuracy for bacterial (88%) versus fungal (56%) ulcers. An ensemble of the best-performing CNN and best-performing human achieved the highest AUC of 0.87, although this was not statistically significantly higher than the best CNN (0.83; P = 0.17) or best human (0.79; P = 0.09).
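The abstract reports P values for the AUC comparisons without naming the statistical test. A paired bootstrap over test-set images is one common way to obtain such a comparison; the sketch below uses that approach and is an assumption, not the authors' method.

```python
# Illustrative paired bootstrap for comparing two classifiers' AUCs on the
# same test set (the abstract does not specify the test actually used).
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_difference(y_true, prob_a, prob_b, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample images with replacement
        if len(np.unique(y_true[idx])) < 2:  # both classes needed for an AUC
            continue
        diffs.append(roc_auc_score(y_true[idx], prob_a[idx]) -
                     roc_auc_score(y_true[idx], prob_b[idx]))
    diffs = np.array(diffs)
    ci = np.percentile(diffs, [2.5, 97.5])   # 95% CI for the AUC difference
    # Two-sided bootstrap p-value under the null of equal AUCs.
    p_value = 2 * min((diffs <= 0).mean(), (diffs >= 0).mean())
    return diffs.mean(), ci, p_value
```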
Conclusions: Computer vision models achieved superhuman performance in identifying the underlying infectious cause of corneal ulcers compared with cornea specialists. The best-performing model, MobileNet, attained an AUC of 0.83 to 0.86 without any additional clinical or historical information. These findings suggest the potential for future implementation of these models to enable earlier directed antimicrobial therapy in the management of infectious keratitis, which may improve visual outcomes. Additional studies are ongoing to incorporate clinical history and expert opinion into predictive models.