Menze Bjoern H, Jakab Andras, Bauer Stefan, Kalpathy-Cramer Jayashree, Farahani Keyvan, Kirby Justin, Burren Yuliya, Porz Nicole, Slotboom Johannes, Wiest Roland, Lanczi Levente, Gerstner Elizabeth, Weber Marc-André, Arbel Tal, Avants Brian B, Ayache Nicholas, Buendia Patricia, Collins D Louis, Cordier Nicolas, Corso Jason J, Criminisi Antonio, Das Tilak, Delingette Hervé, Demiralp Çağatay, Durst Christopher R, Dojat Michel, Doyle Senan, Festa Joana, Forbes Florence, Geremia Ezequiel, Glocker Ben, Golland Polina, Guo Xiaotao, Hamamci Andac, Iftekharuddin Khan M, Jena Raj, John Nigel M, Konukoglu Ender, Lashkari Danial, Mariz José Antonió, Meier Raphael, Pereira Sérgio, Precup Doina, Price Stephen J, Raviv Tammy Riklin, Reza Syed M S, Ryan Michael, Sarikaya Duygu, Schwartz Lawrence, Shin Hoo-Chang, Shotton Jamie, Silva Carlos A, Sousa Nuno, Subbanna Nagesh K, Szekely Gabor, Taylor Thomas J, Thomas Owen M, Tustison Nicholas J, Unal Gozde, Vasseur Flor, Wintermark Max, Ye Dong Hye, Zhao Liang, Zhao Binsheng, Zikic Darko, Prastawa Marcel, Reyes Mauricio, Van Leemput Koen
IEEE Trans Med Imaging. 2015 Oct;34(10):1993-2024. doi: 10.1109/TMI.2014.2377694. Epub 2014 Dec 4.
In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients (manually annotated by up to four raters) and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement between the human raters in segmenting various tumor sub-regions (Dice scores in the range 74%-85%), illustrating the difficulty of this task. We found that different algorithms worked best for different sub-regions (reaching performance comparable to human inter-rater variability), but that no single algorithm ranked in the top for all sub-regions simultaneously. Fusing several good algorithms using a hierarchical majority vote yielded segmentations that consistently ranked above all individual algorithms, indicating remaining opportunities for further methodological improvements. The BRATS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource.
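The two quantitative ideas in the abstract, the Dice score and fusion by voting, can be illustrated with a minimal sketch. It assumes binary masks flattened to lists of 0/1 and uses the standard Dice definition, 2|A∩B| / (|A| + |B|). The fusion shown is a plain per-voxel majority vote over one binary label, which is a simplification and not the hierarchical scheme used in the benchmark. The function names and the toy masks are hypothetical and are not data or results from the paper.

```python
from typing import List, Sequence


def dice_score(a: Sequence[int], b: Sequence[int]) -> float:
    """Dice overlap 2|A∩B| / (|A| + |B|) between two flattened binary masks."""
    if len(a) != len(b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def majority_vote(masks: Sequence[Sequence[int]]) -> List[int]:
    """Per-voxel strict majority vote over binary masks (simplified fusion)."""
    if not masks:
        raise ValueError("need at least one mask")
    if any(len(m) != len(masks[0]) for m in masks):
        raise ValueError("masks must have the same length")
    n = len(masks)
    # A voxel is labeled 1 only if more than half of the masks label it 1.
    return [1 if 2 * sum(1 for v in votes if v) > n else 0
            for votes in zip(*masks)]


# Toy example (made-up masks, for illustration only).
rater = [1, 1, 1, 0, 0, 0]
alg_a = [1, 1, 0, 0, 0, 0]
alg_b = [1, 1, 1, 1, 0, 0]
alg_c = [0, 1, 1, 0, 0, 1]

fused = majority_vote([alg_a, alg_b, alg_c])
for name, mask in [("A", alg_a), ("B", alg_b), ("C", alg_c), ("fused", fused)]:
    print(name, round(dice_score(mask, rater), 3))
```

In this toy case each algorithm makes a different error, so the vote cancels them and the fused mask matches the reference exactly. That only illustrates why fusion can help; it says nothing about how large the effect is on real scans.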