Multidisciplinary Consensus Prostate Contours on Magnetic Resonance Imaging: Educational Atlas and Reference Standard for Artificial Intelligence Benchmarking.

Authors

Song Yuze, Dornisch Anna M, Dess Robert T, Margolis Daniel J A, Weinberg Eric P, Barrett Tristan, Cornell Mariel, Fan Richard E, Harisinghani Mukesh, Kamran Sophia C, Lee Jeong Hoon, Li Cynthia Xinran, Liss Michael A, Rusu Mirabela, Santos Jason, Sonn Geoffrey A, Vidic Igor, Woolen Sean A, Dale Anders M, Seibert Tyler M

Affiliations

Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California; Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, California.

Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California.

Publication information

Int J Radiat Oncol Biol Phys. 2025 Mar 26. doi: 10.1016/j.ijrobp.2025.03.024.

Abstract

PURPOSE

Evaluation of artificial intelligence (AI) algorithms for prostate segmentation is challenging because ground truth is lacking. We aimed to: (1) create a reference standard data set with precise prostate contours by expert consensus, and (2) evaluate various AI tools against this standard.

METHODS AND MATERIALS

We obtained prostate magnetic resonance imaging cases from 6 institutions from the Quantitative Prostate Imaging Consortium. A panel of 4 experts (2 genitourinary radiologists and 2 prostate radiation oncologists) meticulously developed consensus prostate segmentations on axial T2-weighted series. We evaluated the performance of 6 AI tools (3 commercially available and 3 academic) using Dice scores, distance from reference contour, and volume error.
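As an illustration of two of the metrics named above, the following is a minimal sketch of a Dice score and a percent volume error computed on binary masks. It is not the authors' code. The function names, the flattened 0/1 mask representation, and the choice of an absolute (unsigned) volume error are assumptions made for this example; the abstract does not specify these details. Both masks are assumed to share the same voxel grid, so the voxel volume cancels in the ratio.

```python
from typing import Sequence


def dice_score(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for two flattened binary masks."""
    if len(reference) != len(predicted):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for r, p in zip(reference, predicted) if r and p)
    total = sum(1 for r in reference if r) + sum(1 for p in predicted if p)
    # Two empty masks are treated as perfect agreement (a convention, assumed here).
    return 1.0 if total == 0 else 2.0 * overlap / total


def volume_error_percent(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """Absolute volume difference as a percentage of the reference volume."""
    ref_voxels = sum(1 for r in reference if r)
    pred_voxels = sum(1 for p in predicted if p)
    if ref_voxels == 0:
        raise ValueError("reference mask is empty")
    return 100.0 * abs(pred_voxels - ref_voxels) / ref_voxels


# Toy masks (hypothetical data): the prediction misses one reference voxel.
reference_mask = [0, 1, 1, 1, 1, 0]
predicted_mask = [0, 1, 1, 1, 0, 0]

print(dice_score(reference_mask, predicted_mask))
print(volume_error_percent(reference_mask, predicted_mask))
```

In this toy case the masks overlap in 3 voxels out of 4 reference and 3 predicted voxels, so the Dice score is 6/7 and the volume error is 25%.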

RESULTS

The panel achieved consensus prostate segmentation on each slice of all 68 patient cases included in the reference data set. We present 2 patient examples to serve as contouring guides. Depending on the AI tool, median Dice scores (across patients) ranged from 0.80 to 0.94 for whole prostate segmentation. For a typical (median) patient, AI tools had a mean error over the prostate surface ranging from 1.3 to 2.4 mm. They maximally deviated 3.0 to 9.4 mm outside the prostate and 3.0 to 8.5 mm inside the prostate for a typical patient. Error in prostate volume measurement for a typical patient ranged from 4.3% to 31.4%.
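The surface-distance figures above can be made concrete with a small sketch. This is one plausible way to compute a mean surface error and the maximum deviation outside and inside the reference contour, under stated assumptions: masks are sets of 2D voxel coordinates on a shared grid, the surface is taken as mask voxels with a 4-neighbour outside the mask, distances are measured one way (from the predicted surface to the nearest reference surface voxel), and spacing is isotropic. The abstract does not describe the authors' exact definitions, and all names here are hypothetical.

```python
import math
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int]


def boundary(mask: Set[Voxel]) -> Set[Voxel]:
    """Voxels of `mask` that have at least one 4-neighbour outside the mask."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (r, c)
        for (r, c) in mask
        if any((r + dr, c + dc) not in mask for dr, dc in steps)
    }


def surface_errors(
    reference: Set[Voxel], predicted: Set[Voxel], spacing_mm: float = 1.0
) -> Dict[str, float]:
    """Distances from the predicted surface to the nearest reference surface voxel."""
    ref_edge = boundary(reference)
    pred_edge = boundary(predicted)
    if not ref_edge or not pred_edge:
        raise ValueError("both masks must be non-empty")
    outside, inside, everything = [], [], []
    for voxel in pred_edge:
        d = spacing_mm * min(math.dist(voxel, q) for q in ref_edge)
        everything.append(d)
        # A predicted surface voxel lying within the reference mask counts as
        # an inward deviation; otherwise it is an outward deviation.
        (inside if voxel in reference else outside).append(d)
    return {
        "mean_mm": sum(everything) / len(everything),
        "max_outside_mm": max(outside, default=0.0),
        "max_inside_mm": max(inside, default=0.0),
    }


# Toy example (hypothetical data): a 5x5 square and a copy shifted 2 columns right.
reference_mask = {(r, c) for r in range(5) for c in range(5)}
predicted_mask = {(r, c + 2) for r in range(5) for c in range(5)}

result = surface_errors(reference_mask, predicted_mask, spacing_mm=1.0)
print(result)
```

Reporting the outward and inward maxima separately mirrors the way the results distinguish deviation outside the prostate from deviation inside it; a symmetric variant would also measure from the reference surface to the predicted surface.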

CONCLUSIONS

We established an expert consensus benchmark for prostate segmentation. The best-performing AI tools have typical accuracy greater than that reported for radiation oncologists using computed tomography scans (the most common clinical approach for radiation therapy planning). Physician review remains essential to detect occasional major errors.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/396a/12243632/f5aef3042097/nihms-2092091-f0001.jpg
