Tang Olivia, Xu Yuchen, Tang Yucheng, Lee Ho Hin, Chen Yunqiang, Gao Dashan, Han Shizhong, Gao Riqiang, Savona Michael R, Abramson Richard G, Huo Yuankai, Landman Bennett A
Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37212.
12 Sigma Technologies, San Diego, CA, USA 92130.
Proc SPIE Int Soc Opt Eng. 2020;11313. doi: 10.1117/12.2549035. Epub 2020 Mar 10.
Segmentation of abdominal computed tomography (CT) provides spatial context, morphological properties, and a framework for tissue-specific radiomics to guide quantitative radiological assessment. A 2015 MICCAI challenge spurred substantial innovation in multi-organ abdominal CT segmentation with both traditional and deep learning methods. Recent innovations in deep methods have driven performance toward levels at which clinical translation is appealing. However, continued cross-validation on open datasets presents the risk of indirect knowledge contamination and could result in circular reasoning. Moreover, "real world" segmentations can be challenging due to the wide variability of abdominal physiology within patients. Herein, we perform two data retrievals to capture clinically acquired deidentified abdominal CT cohorts on which to evaluate a recently published variation on 3D U-Net (baseline algorithm). First, we retrieved 2004 deidentified studies on 476 patients with diagnosis codes involving spleen abnormalities (cohort A). Second, we retrieved 4313 deidentified studies on 1754 patients without diagnosis codes involving spleen abnormalities (cohort B). We performed a prospective evaluation of the existing algorithm on both cohorts, yielding failure rates of 13% and 8%, respectively. Then, we identified 51 subjects in cohort A with segmentation failures and manually corrected the liver and gallbladder labels. We re-trained the model with the manual labels added, which improved the failure rates to 9% and 6% for cohorts A and B, respectively. In summary, the performance of the baseline on the prospective cohorts was similar to that on previously published datasets. Moreover, adding data from the first cohort substantively improved performance when evaluated on the second, withheld validation cohort.
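The per-cohort failure rate reported above can be illustrated with a minimal sketch. It assumes each segmentation has already been reviewed and flagged as failed or not; the abstract does not state the failure criterion or the unit being counted, so the function name `failure_rates`, the record layout, and the toy data below are all hypothetical and are not the study's data.

```python
from collections import defaultdict


def failure_rates(records):
    """Return {cohort: fraction of records flagged as failed}.

    `records` is an iterable of (cohort_label, failed) pairs, where
    `failed` is a boolean from a quality review. This layout is an
    assumption made for illustration only.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for cohort, failed in records:
        totals[cohort] += 1
        failures[cohort] += int(failed)
    return {cohort: failures[cohort] / totals[cohort] for cohort in totals}


# Toy records (hypothetical, not taken from the study).
toy_records = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False), ("B", False),
]

rates = failure_rates(toy_records)
for cohort in sorted(rates):
    print(f"cohort {cohort}: {rates[cohort]:.0%} failure rate")
```

Running the same function on the flags obtained before and after re-training would give the kind of before/after comparison the abstract describes.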