Bae Sang Min, Kim Dong Hwan, Kang Ji Hun
Hanyang University Guri Hospital, Guri-si, Republic of Korea.
Asan Medical Center, Seoul, Republic of Korea.
Abdom Radiol (NY). 2025 Jan 22. doi: 10.1007/s00261-025-04813-2.
Ovarian-Adnexal Reporting and Data System (O-RADS) US provides a standardized lexicon for ovarian and adnexal lesions and stratifies malignancy risk by morphological features, which is essential for proper management. However, the inter-reader reliability of O-RADS US categorization has not been systematically evaluated. This study aimed to systematically determine the inter-reader reliability of O-RADS US categorization and to identify the factors that affect it.
Original articles reporting the inter-reader reliability of O-RADS US in lesion categorization were identified in the MEDLINE, EMBASE, and Web of Science databases from January 2018 to December 2023. DerSimonian-Laird random-effects models were used to determine the meta-analytic pooled inter-reader reliability of the O-RADS US categorization. Subgroup meta-regression analysis was performed to identify the factors causing study heterogeneity.
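The pooling step named above can be illustrated with a minimal sketch of the DerSimonian-Laird estimator. The abstract does not state the scale on which the reliability coefficients were pooled or the software used, so the function name `dersimonian_laird`, the input format (one estimate and one within-study variance per study), and the normal-approximation 95% CI are assumptions for illustration only, not the authors' implementation.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study estimates with a DerSimonian-Laird random-effects model.

    effects   -- per-study estimates (e.g. reliability coefficients)
    variances -- per-study within-study variances (same order)
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of between-study variance, truncated at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical inputs, not data from the included studies.
    result = dersimonian_laird([0.70, 0.90], [0.01, 0.01])
    print(result)
```

When the between-study variance estimate is zero, the random-effects weights reduce to the inverse-variance weights, so the pooled value coincides with the fixed-effect estimate.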
Fourteen original articles with 5139 ovarian and adnexal lesions were included. The inter-reader reliability of O-RADS US in lesion categorization ranged from 0.71 to 0.99, with a meta-analytic pooled estimate of 0.83 (95% CI, 0.78-0.88), indicating almost perfect reliability. Substantial study heterogeneity was observed in the inter-reader reliability of the O-RADS US categorization (I² = 96.9%). In subgroup meta-regression analysis, reader experience was the only factor associated with study heterogeneity. Pooled inter-reader reliability of the O-RADS US categorization was higher in studies in which all readers were experienced (0.86; 95% CI, 0.81-0.91) than in studies with multiple readers that included trainees (0.74; 95% CI, 0.70-0.78; P = 0.009). The inter-reader reliability of US descriptors ranged from 0.39 to 0.97, with ascites and peritoneal nodules showing almost perfect reliability (0.79-0.97).
The O-RADS US risk stratification system demonstrated almost perfect inter-reader reliability in lesion categorization. Our results highlight the importance of targeted training and descriptor simplification to improve inter-reader reliability and clinical adoption.