Reference standard for the evaluation of automatic segmentation algorithms: Quantification of inter-observer variability of manual delineation of prostate contour on MRI.

Author information

Department of Radiology, Hôpitaux Universitaires de Strasbourg, Hôpital de Hautepierre, 67200, Strasbourg, France; Breast and Thyroid Imaging Unit, Institut de Cancérologie Strasbourg Europe, 67200, Strasbourg, France; IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400, Illkirch, France.

Inria, Epione Team, Sophia Antipolis, Université Côte d'Azur, 06902, Nice, France.

Publication information

Diagn Interv Imaging. 2024 Feb;105(2):65-73. doi: 10.1016/j.diii.2023.08.001. Epub 2023 Aug 21.

Abstract

PURPOSE

The purpose of this study was to investigate inter-reader variability in manual prostate contour segmentation on magnetic resonance imaging (MRI) examinations and to determine the optimal number of readers required to establish a reliable reference standard.

MATERIALS AND METHODS

Seven radiologists with varying levels of experience independently performed manual segmentation of the prostate contour (whole-gland [WG] and transition zone [TZ]) on 40 prostate MRI examinations obtained in 40 patients. Inter-reader variability in prostate contour delineations was estimated using standard metrics (Dice similarity coefficient [DSC], Hausdorff distance and volume-based metrics). The impact of the number of readers (from two to seven) on segmentation variability was assessed using pairwise metrics (consistency) and metrics computed with respect to a reference segmentation (conformity), the reference being obtained with either majority voting or the simultaneous truth and performance level estimation (STAPLE) algorithm.
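As an illustration of the two kinds of metric described above, the following minimal sketch computes a pairwise DSC (consistency) and a DSC against a majority-vote reference (conformity) on binary masks. The function names `dice` and `majority_vote` and the toy 6 x 6 masks are assumptions made for this example; they are not the study's code or data, and STAPLE is not implemented here.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention chosen here).
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom


def majority_vote(masks):
    """Reference mask keeping voxels marked by strictly more than half of the readers."""
    stack = np.asarray(masks, dtype=bool)
    return stack.sum(axis=0) * 2 > stack.shape[0]


# Toy 2D masks standing in for three readers' delineations (hypothetical data).
reader_a = np.zeros((6, 6), dtype=bool)
reader_a[1:5, 1:5] = True
reader_b = np.zeros((6, 6), dtype=bool)
reader_b[1:5, 2:6] = True
reader_c = np.zeros((6, 6), dtype=bool)
reader_c[2:6, 1:5] = True

# Consistency: agreement between two readers.
pairwise_ab = dice(reader_a, reader_b)

# Conformity: agreement of one reader with the consensus reference.
consensus = majority_vote([reader_a, reader_b, reader_c])
conformity_a = dice(reader_a, consensus)

print(f"pairwise DSC (A vs B): {pairwise_ab:.3f}")
print(f"DSC of A vs majority vote: {conformity_a:.3f}")
```

The same functions apply unchanged to 3D volumes, since they only rely on element-wise operations over arrays of identical shape.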

RESULTS

The average segmentation DSC for two readers in pairwise comparison was 0.919 for WG and 0.876 for TZ. Variability decreased as the number of readers increased: the interquartile ranges of the DSC were 0.076 (WG) / 0.021 (TZ) for configurations with two readers, 0.005 (WG) / 0.012 (TZ) for configurations with three readers, and 0.002 (WG) / 0.0037 (TZ) for configurations with six readers. The interquartile range decreased slightly faster between two and three readers than between three and six readers. When using consensus methods, variability often reached its minimum with three readers: with STAPLE, DSC was 0.96 (range: 0.945-0.971) for WG and 0.94 (range: 0.912-0.957) for TZ, and the interquartile range was minimal for configurations with three readers.
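One plausible way to read "configurations with k readers" is as every subset of k readers drawn from the seven, with the interquartile range taken across the per-configuration scores. The sketch below follows that reading under stated assumptions: the pairwise DSC matrix is synthetic (seeded random values, not the study's measurements), and `configuration_scores` and `iqr` are hypothetical helper names. The abstract does not spell out the exact aggregation used, so this is an illustration rather than the authors' procedure.

```python
from itertools import combinations

import numpy as np


def configuration_scores(pairwise, k):
    """Mean pairwise DSC for every subset (configuration) of k readers."""
    n = pairwise.shape[0]
    scores = []
    for subset in combinations(range(n), k):
        pairs = [pairwise[i, j] for i, j in combinations(subset, 2)]
        scores.append(float(np.mean(pairs)))
    return np.array(scores)


def iqr(values):
    """Interquartile range (75th minus 25th percentile)."""
    q1, q3 = np.percentile(values, [25, 75])
    return q3 - q1


# Synthetic symmetric matrix of pairwise DSC values for 7 readers (illustrative only).
rng = np.random.default_rng(0)
n_readers = 7
upper = np.triu(rng.uniform(0.85, 0.95, size=(n_readers, n_readers)), k=1)
pairwise = upper + upper.T
np.fill_diagonal(pairwise, 1.0)

for k in range(2, n_readers + 1):
    scores = configuration_scores(pairwise, k)
    print(f"k={k}: {len(scores)} configurations, IQR={iqr(scores):.4f}")
```

With seven readers there are 21 configurations of two readers, 35 of three, and a single configuration of seven, so the spread across configurations necessarily collapses to zero at k = 7.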

CONCLUSION

The number of readers affects inter-reader variability, in terms of both inter-reader consistency and conformity to a reference. With both pairwise metrics and metrics computed with respect to a reference, variability is either minimal for three readers or three readers mark a tipping point in how variability evolves. Accordingly, three readers may represent an optimal number to determine references for artificial intelligence applications.
