Chiang Michael F, Jiang Lei, Gelman Rony, Du Yunling E, Flynn John T
Department of Ophthalmology, Columbia University College of Physicians and Surgeons, New York, NY 10032, USA.
Arch Ophthalmol. 2007 Jul;125(7):875-80. doi: 10.1001/archopht.125.7.875.
To measure agreement of plus disease diagnosis among retinopathy of prematurity (ROP) experts.
A set of 34 wide-angle retinal photographs from infants with ROP was compiled on a secure Web site and was interpreted independently by 22 recognized ROP experts. Diagnostic agreement was analyzed using 3-level (plus, pre-plus, or neither) and 2-level (plus or not plus) categorizations.
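As an illustration of the agreement analysis described above, the following is a minimal sketch of how a mean (weighted) kappa for each expert compared with all others could be computed. It is not the authors' code. The abstract does not state the weighting scheme, so linear disagreement weights are an assumption here, as are the function names `weighted_kappa` and `mean_kappa_vs_others`, the integer coding of the categories (0 = neither, 1 = pre-plus, 2 = plus), and the simplification that every expert rated every image.

```python
def weighted_kappa(a, b, n_categories, weighted=True):
    """Cohen's kappa between two raters' labels coded 0..n_categories-1.

    With weighted=True, linear disagreement weights |i - j| / (k - 1) are
    used (an assumption; the abstract does not specify the weights).
    With weighted=False, this reduces to the unweighted kappa.
    """
    k = n_categories
    n = len(a)
    if n == 0 or n != len(b) or k < 2:
        raise ValueError("need two equal-length, non-empty label lists and k >= 2")

    # Observed joint proportions of (rater A label, rater B label).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    # Observed and chance-expected weighted disagreement.
    d_obs = sum(weight(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))

    if d_exp == 0:
        # Both raters used one identical category throughout; kappa is undefined.
        return float("nan")
    return 1.0 - d_obs / d_exp


def mean_kappa_vs_others(ratings, n_categories, weighted=True):
    """Map each expert to the mean of their pairwise kappas with every other expert.

    `ratings` is a dict: expert name -> list of labels, one per image.
    """
    means = {}
    for expert, labels in ratings.items():
        pairwise = [
            weighted_kappa(labels, other_labels, n_categories, weighted)
            for name, other_labels in ratings.items()
            if name != expert
        ]
        means[expert] = sum(pairwise) / len(pairwise)
    return means


# Hypothetical ratings for three experts on six images (illustrative only,
# not data from the study).
example_ratings = {
    "expert_a": [0, 1, 2, 0, 1, 2],
    "expert_b": [0, 1, 2, 0, 2, 2],
    "expert_c": [0, 0, 2, 1, 1, 2],
}
example_means = mean_kappa_vs_others(example_ratings, n_categories=3)
```

For the 2-level categorization, the same functions could be called with `n_categories=2` and `weighted=False`. Images for which an expert gave no diagnosis would need to be excluded pairwise before calling `weighted_kappa`, which this sketch does not do.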
In the 3-level categorization, all experts agreed on the same diagnosis in 4 of 34 images (12%), and the mean weighted kappa statistic for each expert compared with all others was between 0.21 and 0.40 (fair agreement) for 7 experts (32%) and between 0.41 and 0.60 (moderate agreement) for 15 experts (68%). In the 2-level categorization, all experts who provided a diagnosis agreed in 7 of 34 images (21%), and the mean kappa statistic for each expert compared with all others was between 0 and 0.20 (slight agreement) for 1 expert (5%), between 0.21 and 0.40 (fair agreement) for 3 experts (14%), between 0.41 and 0.60 (moderate agreement) for 12 experts (55%), and between 0.61 and 0.80 (substantial agreement) for 6 experts (27%).
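The percentages reported above follow directly from the expert counts out of 22. A short sketch that maps a kappa value to the agreement bands used in the text and re-derives the rounded percentages is given below. The bands from 0 to 0.80 are those quoted in the abstract; the "poor" (below 0) and "almost perfect" (above 0.80) labels are taken from the conventional Landis and Koch scale and do not appear in the abstract. The names `agreement_band` and `percentages` are hypothetical.

```python
def agreement_band(kappa):
    """Return the agreement label for a kappa value.

    Bands 0-0.20, 0.21-0.40, 0.41-0.60 and 0.61-0.80 are quoted in the
    abstract; "poor" and "almost perfect" are assumed from the
    Landis and Koch convention.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def percentages(counts):
    """Convert expert counts per band into whole-number percentages."""
    total = sum(counts.values())
    return {band: round(100 * n / total) for band, n in counts.items()}


# Expert counts per band as reported in the abstract (22 experts in total).
counts_3_level = {"fair": 7, "moderate": 15}
counts_2_level = {"slight": 1, "fair": 3, "moderate": 12, "substantial": 6}

pct_3_level = percentages(counts_3_level)
pct_2_level = percentages(counts_2_level)
```

Because each percentage is rounded separately, the four 2-level figures sum to 101% rather than 100%.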
Interexpert agreement of plus disease diagnosis is imperfect. This may have important implications for clinical ROP management, continued refinement of the international ROP classification system, development of computer-based diagnostic algorithms, and implementation of ROP telemedicine systems.