Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation.

Author information

From the Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology (K.V.H., C.P.B., A.K., K.I.L., K.C., J.P., B.R.R., E.R.G., J.K.C.), and Stephen E. and Catherine Pappas Center for Neuro-Oncology (O.A., A.K., K.I.L., E.R.G.), Massachusetts General Hospital, 149 13th St, Charlestown, MA 02129; Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, Mass (K.V.H., K.C., J.P.); MGH and BWH Center for Clinical Data Science, Boston, Mass (C.P.B., J.K.C.); Department of Radiation Oncology, Division of Radiation Oncology (S.A., C.C.); Department of Diagnostic Radiology, Division of Diagnostic Imaging (C.C.), and Department of Neuroradiology (J.M.J.), Division of Diagnostic Imaging, The University of Texas MD Anderson Cancer Center, Houston, Tex; Departments of Radiology (R.Y.H.) and Neurology (T.T.B.), Brigham and Women's Hospital, Boston, Mass; Department of Radiology and Advanced Imaging Research Center, University of Texas Southwestern Medical Center, Dallas, Tex (M.P.); and Department of Ophthalmology, University of Colorado Anschutz Medical Campus, Aurora, Colo (J.K.C.).

Publication information

Radiol Artif Intell. 2024 Jan;6(1):e220231. doi: 10.1148/ryai.220231.

Abstract

Purpose

To present results from a literature survey on practices in deep learning segmentation algorithm evaluation and perform a study on expert quality perception of brain tumor segmentation.

Materials and Methods

A total of 180 articles reporting on brain tumor segmentation algorithms were surveyed for the reported quality evaluation. Additionally, ratings of segmentation quality on a four-point scale were collected from medical professionals for 60 brain tumor segmentation cases.

Results

Of the surveyed articles, Dice score, sensitivity, and Hausdorff distance were the most popular metrics to report segmentation performance. Notably, only 2.8% of the articles included clinical experts' evaluation of segmentation quality. The experimental results revealed a low interrater agreement (Krippendorff α, 0.34) in experts' segmentation quality perception. Furthermore, the correlations between the ratings and commonly used quantitative quality metrics were low (Kendall tau between Dice score and mean rating, 0.23; Kendall tau between Hausdorff distance and mean rating, 0.51), with large variability among the experts.

Conclusion

The results demonstrate that quality ratings are prone to variability due to the ambiguity of tumor boundaries and individual perceptual differences, and existing metrics do not capture the clinical perception of segmentation quality.

Keywords: Brain Tumor Segmentation, Deep Learning Algorithms, Glioblastoma, Cancer, Machine Learning

Clinical trial registration nos. NCT00756106 and NCT00662506

© RSNA, 2023.
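The abstract above relates an overlap metric (Dice score) to expert ratings through a rank correlation (Kendall tau). As a rough illustration of what these two quantities measure, here is a minimal Python sketch. It is not the authors' code: the function names and the toy values are hypothetical, and the correlation shown is the simple tau-a variant, which ignores the tie correction that a study with four-point ratings would likely need.

```python
from itertools import combinations


def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    pred = [bool(p) for p in pred]
    truth = [bool(t) for t in truth]
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant pairs - discordant pairs) / total pairs.

    Tied pairs contribute zero; no tie correction is applied (assumption).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    score = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical toy data: one Dice score and one mean expert rating per case.
toy_dice = [0.95, 0.90, 0.70, 0.85]
toy_mean_rating = [3.5, 2.0, 2.5, 3.0]

if __name__ == "__main__":
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
    print(kendall_tau_a(toy_dice, toy_mean_rating))
```

A tau near 1 would mean that cases with higher Dice scores are consistently rated higher by experts; values closer to 0, such as the 0.23 reported above, indicate that the ranking by metric and the ranking by expert rating agree only weakly.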

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d51f/10831514/ea63b21eb137/ryai.220231.VA.jpg
