Amsterdam UMC, University of Amsterdam, Department of Urology, Meibergdreef 9, Amsterdam, The Netherlands.
Department of Urology, Netherlands Cancer Institute - Antoni van Leeuwenhoek Hospital, Amsterdam, The Netherlands.
Virchows Arch. 2023 Aug;483(2):197-206. doi: 10.1007/s00428-023-03589-4. Epub 2023 Jul 6.
The development of artificial intelligence-based imaging techniques for prostate cancer (PCa) detection and diagnosis requires a reliable ground truth, which is generally based on histopathology from radical prostatectomy specimens. This study proposes a comprehensive protocol for the annotation of prostatectomy pathology slides. To evaluate the reliability of the protocol, interobserver variability was assessed among five pathologists, who annotated ten radical prostatectomy specimens comprising 74 whole-mount pathology slides. Interobserver variability was assessed for both the localization and the grading of PCa. The results indicate excellent overall agreement on the localization of PCa (Gleason pattern ≥ 3) and clinically significant PCa (Gleason pattern ≥ 4), with Dice similarity coefficients (DSC) of 0.91 and 0.88, respectively. On a per-slide level, agreement was almost perfect for the primary Gleason pattern and substantial for the secondary Gleason pattern, with Fleiss' kappa of 0.819 (95% CI 0.659-0.980) and 0.726 (95% CI 0.573-0.878), respectively. Agreement on International Society of Urological Pathology Grade Group was evaluated for the index lesions: the pathologists agreed in 70% of cases, with a mean DSC of 0.92 for all index lesions. These findings show that a standardized protocol for prostatectomy pathology annotation provides reliable data on PCa localization and grading, with relatively high levels of interobserver agreement. More complicated tissue characterization, such as the presence of cribriform growth and intraductal carcinoma, remains a source of interobserver variability and should be treated with care when used in ground truth datasets.
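For readers unfamiliar with the two agreement statistics reported above, the following is a minimal sketch of how a Dice similarity coefficient and Fleiss' kappa can be computed. It is an illustration under stated assumptions, not the study's analysis pipeline: the function names, the representation of an annotation as a set of pixel coordinates, and the averaging of DSC over all pairs of observers are assumptions made here, and the abstract does not specify how the published values were aggregated.

```python
from collections import Counter
from itertools import combinations


def dice(mask_a, mask_b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|).

    Each annotation is assumed to be a set of (row, col) pixel coordinates.
    """
    if not mask_a and not mask_b:
        return 1.0  # assumed convention: two empty annotations agree fully
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


def mean_pairwise_dice(masks):
    """Average DSC over every pair of observers (an assumed aggregation)."""
    pairs = list(combinations(masks, 2))
    return sum(dice(a, b) for a, b in pairs) / len(pairs)


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical ratings.

    ratings: one list per slide, holding one category label per rater.
    Every slide must be rated by the same number of raters.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every slide needs the same number of ratings")

    category_totals = Counter()
    observed_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this slide.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        observed_sum += agreeing / (n_raters * (n_raters - 1))

    p_observed = observed_sum / n_subjects
    # Chance agreement from the overall share of each category.
    p_expected = sum(
        (total / (n_subjects * n_raters)) ** 2
        for total in category_totals.values()
    )
    if p_expected == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data, invented for illustration only.
observer_1 = {(0, 0), (0, 1)}
observer_2 = {(0, 1), (1, 1)}
print(dice(observer_1, observer_2))

primary_pattern = [[3, 3, 4], [4, 4, 4]]  # two slides, three raters
print(fleiss_kappa(primary_pattern))
```

Sets keep the sketch free of dependencies; on real whole-mount slides the same formulas would normally be applied to boolean image masks.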