Moleta Chace, Campbell J Peter, Kalpathy-Cramer Jayashree, Chan R V Paul, Ostmo Susan, Jonas Karyn, Chiang Michael F
Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon.
Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, Massachusetts.
Am J Ophthalmol. 2017 Apr;176:70-76. doi: 10.1016/j.ajo.2016.12.025. Epub 2017 Jan 11.
To identify any temporal trends in the diagnosis of plus disease in retinopathy of prematurity (ROP) by experts.
Reliability analysis.
ROP experts were recruited in 2007 and 2016 to classify 34 wide-field fundus images of ROP as plus, pre-plus, or normal, coded as "3," "2," and "1," respectively, in the database. The main outcome was the average calculated score for each image in each cohort. Secondary outcomes included correlation on the relative ordering of the images in 2016 vs 2007, interexpert agreement, and intraexpert agreement.
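The primary outcome (mean score per image) and the rank-correlation secondary outcome can be illustrated with a minimal Python sketch. The function names and the tiny 4-image, 3-expert rating matrices below are hypothetical and are not the study's data; tied mean scores are assumed to receive average ranks, which the abstract does not specify.

```python
from statistics import mean

def image_means(ratings):
    """ratings: one list per expert, each holding one score (1, 2, or 3) per image."""
    return [mean(scores) for scores in zip(*ratings)]

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical ratings: 3 experts x 4 images per cohort (1 = normal, 2 = pre-plus, 3 = plus).
cohort_2007 = [[1, 2, 3, 1], [1, 1, 3, 2], [2, 2, 3, 1]]
cohort_2016 = [[2, 2, 3, 1], [1, 3, 3, 2], [2, 3, 3, 2]]

means_2007 = image_means(cohort_2007)
means_2016 = image_means(cohort_2016)
n_higher = sum(b > a for a, b in zip(means_2007, means_2016))
rho = spearman(means_2007, means_2016)
```

In this toy example the 2016 mean is higher for three of the four images while the relative ordering of the images is unchanged, which mirrors the pattern the study reports: a shift in absolute scores alongside preserved ranking.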
The average score was higher in 2016 than in 2007 for 30 of 34 (88%) images, reflecting fewer images classified as normal (P < .01), a similar number classified as pre-plus (P = .52), and more classified as plus (P < .01). The mean weighted kappa was 0.36 (range 0.21-0.60) in 2007, compared with 0.22 (range 0-0.40) in 2016. Disease severity rankings were strongly correlated between the 2 cohorts (Spearman rank correlation ρ = 0.94), indicating near-perfect agreement on relative disease severity.
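As an illustration of the interexpert agreement statistic, the following is a minimal sketch of a weighted kappa between two raters on the three ordinal categories, averaged over all rater pairs. The abstract does not state the weighting scheme or how the mean was formed, so linear weights and pairwise averaging are assumptions here, and the function names are hypothetical.

```python
from itertools import combinations

def weighted_kappa(a, b, categories=(1, 2, 3)):
    """Linearly weighted Cohen's kappa for two raters' scores on the same images.

    Assumes the two raters do not both assign a single identical category to
    every image (the expected disagreement would then be zero).
    """
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(a)
    # Observed joint proportions of (rater A category, rater B category).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[idx[x]][idx[y]] += 1 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # linear disagreement weight (assumption)
            observed += w * obs[i][j]
            expected += w * row[i] * col[j]
    return 1.0 - observed / expected

def mean_pairwise_kappa(ratings):
    """Mean weighted kappa over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(weighted_kappa(a, b) for a, b in pairs) / len(pairs)

# Hypothetical scores from two raters on four images.
identical = weighted_kappa([1, 2, 3, 2], [1, 2, 3, 2])
chance_level = weighted_kappa([1, 1, 3, 3], [1, 3, 1, 3])
```

Identical ratings give a kappa of 1, and ratings that agree no more often than their marginal frequencies predict give a kappa of 0, so a drop from 0.36 to 0.22 indicates lower agreement beyond chance among the 2016 raters.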
Despite good agreement between cohorts on relative disease severity ranking, the higher average score and classifications for each image demonstrate that experts are diagnosing pre-plus and plus disease at earlier stages of disease severity in 2016, compared with 2007. This has implications for patient care, research, and teaching, and additional studies are needed to better understand this temporal trend in image-based plus disease diagnosis.