Vemulapalli Kalyan, Gandhewar Rishikesh, Safitri Atika, Ng Sueko M, Michelessi Manuele, Azuara-Blanco Augusto, Liu Su-Hsun, Virgili Gianni, Hu Kuang
Glaucoma Research Fellow, Moorfields Eye Hospital, London, UK.
Department of Ophthalmology, Royal Berkshire Hospital, Reading, UK.
Cochrane Database Syst Rev. 2025 Jun 17;6(6):CD016114. doi: 10.1002/14651858.CD016114.
This is a protocol for a Cochrane Review (diagnostic). The objectives are as follows. Primary objective: to determine the accuracy of artificial intelligence (AI) algorithms as a diagnostic tool for glaucoma compared with human graders in a community or secondary care setting. Secondary objectives: (1) to compare the performance of different AI algorithms in the diagnosis of glaucoma; (2) to explore other potential causes of heterogeneity in diagnostic performance compared with human graders, using subgroup analysis of the following characteristics: clinical setting in which the test is used (the general population in the community versus people referred to secondary care); study design (studies enrolling consecutive participants in the same setting versus multicentre registries and public databases); characteristics of the population (age according to quartiles, sex, symptomatic versus asymptomatic), when sufficient data are available; prevalence of glaucoma in the training and test sets (< 5% versus ≥ 5%), since sensitivity and specificity may depend on disease prevalence [17]; severity of glaucoma; core AI method used (neural networks, random forests, support vector machines, or others); size of the dataset from which performance data were collected (< 1000 versus ≥ 1000 total unique participants); and modalities of input data for the AI algorithms (imaging data, visual fields, clinical parameters, or any combination), particularly as their accessibility and affordability may vary across different settings.
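As an illustration only, the accuracy measures and two of the subgroup thresholds named in the objectives can be sketched in Python. This is a minimal sketch and not the protocol's analysis method: the class `TwoByTwo`, the helper functions, and all counts are hypothetical, and it assumes each study reports a 2x2 table of the AI result against the human-grader reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical 2x2 table: AI result versus human-grader reference."""
    tp: int  # AI positive, reference positive
    fp: int  # AI positive, reference negative
    fn: int  # AI negative, reference positive
    tn: int  # AI negative, reference negative

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    @property
    def diseased(self) -> int:
        return self.tp + self.fn

    def sensitivity(self) -> float:
        # Proportion of reference-positive cases that the AI flags
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-negative cases that the AI clears
        return self.tn / (self.tn + self.fp)


def prevalence_subgroup(table: TwoByTwo) -> str:
    """Assign the < 5% versus >= 5% prevalence subgroup.

    Integer arithmetic avoids floating-point trouble exactly at 5%.
    """
    return "< 5%" if table.diseased * 100 < 5 * table.total else ">= 5%"


def size_subgroup(n_unique_participants: int) -> str:
    """Assign the < 1000 versus >= 1000 unique-participant subgroup."""
    return "< 1000" if n_unique_participants < 1000 else ">= 1000"


# Made-up counts for a single hypothetical study
study = TwoByTwo(tp=36, fp=48, fn=4, tn=912)
print(f"sensitivity = {study.sensitivity():.2f}")
print(f"specificity = {study.specificity():.2f}")
print("prevalence subgroup:", prevalence_subgroup(study))
print("size subgroup:", size_subgroup(1000))
```

The dataset size is passed separately from the 2x2 table because the objectives define that threshold in unique participants, which need not equal the number of test results (for example, when both eyes of a participant are analysed).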