Liu Zhikai, Chen Wanqi, Guan Hui, Zhen Hongnan, Shen Jing, Liu Xia, Liu An, Li Richard, Geng Jianhao, You Jing, Wang Weihu, Li Zhouyu, Zhang Yongfeng, Chen Yuanyuan, Du Junjie, Chen Qi, Chen Yu, Wang Shaobin, Zhang Fuquan, Qiu Jie
Department of Radiation Oncology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Department of Nuclear Medicine, Sun Yat-Sen University Cancer Center, Guangzhou, China.
Front Oncol. 2021 Aug 19;11:702270. doi: 10.3389/fonc.2021.702270. eCollection 2021.
To propose a novel deep-learning-based auto-segmentation model for clinical target volume (CTV) delineation in cervical cancer and to evaluate, using a three-stage multicenter evaluation framework, whether it performs comparably to manual delineation.
An adversarial deep-learning-based auto-segmentation model was trained and configured for cervical cancer CTV contouring using CT data from 237 patients. CT scans of an additional 20 consecutive patients with locally advanced cervical cancer were then collected for a three-stage multicenter randomized controlled evaluation involving nine oncologists from six medical centers. The evaluation combines objective performance metrics, radiation oncologist assessment, and finally a head-to-head Turing imitation test. Accuracy and effectiveness were evaluated step by step. The intra-observer consistency of each oncologist was also tested.
In the stage-1 evaluation, the mean Dice similarity coefficient (DSC) and 95% Hausdorff distance (95HD) of the proposed model were 0.88 and 3.46 mm, respectively. In stage-2, the oncologist grading evaluation showed that the majority of AI contours were comparable to the manually delineated ground-truth (GT) contours. The average CTV scores for AI and GT were 2.68 vs. 2.71 in week 0 (P = .206) and 2.62 vs. 2.63 in week 2 (P = .552), with no statistically significant differences. In stage-3, the Turing imitation test showed that the percentage of AI contours judged to be better than GT contours by ≥5 oncologists was 60.0% in week 0 and 42.5% in week 2. Most oncologists demonstrated good consistency between the 2 weeks (P > 0.05).
The tested AI model was demonstrated to be accurate and comparable to the manual CTV segmentation in cervical cancer patients when assessed by our three-stage evaluation framework.
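The two stage-1 metrics can be illustrated with a minimal sketch. It assumes binary masks stored as NumPy arrays and uses one common convention for 95HD (the larger of the two directed 95th-percentile surface distances); the abstract does not state which variant or software the authors used, so the function names, the boundary definition, and the toy masks below are illustrative assumptions rather than the study's implementation.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(pred, ref).sum() / total


def surface_points(mask, spacing):
    """Physical coordinates of boundary voxels (foreground voxels that have
    at least one background face-neighbour)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = mask.copy()
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if every face-neighbour is foreground.
            interior &= np.roll(padded, shift, axis=axis)[core]
    boundary = mask & ~interior
    return np.argwhere(boundary) * np.asarray(spacing, dtype=float)


def hd95(pred, ref, spacing):
    """95th-percentile Hausdorff distance, in the units of `spacing`.

    Brute-force pairwise distances: fine for small masks, too slow and
    memory-hungry for full CT volumes.
    """
    p = surface_points(pred, spacing)
    r = surface_points(ref, spacing)
    if len(p) == 0 or len(r) == 0:
        raise ValueError("both masks must contain at least one foreground voxel")
    dists = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    pred_to_ref = dists.min(axis=1)
    ref_to_pred = dists.min(axis=0)
    return float(max(np.percentile(pred_to_ref, 95),
                     np.percentile(ref_to_pred, 95)))


# Toy 2D example (hypothetical data): a 6x6 square and the same square
# shifted by one pixel, with 1 mm x 1 mm pixel spacing.
ref_mask = np.zeros((10, 10), dtype=bool)
ref_mask[2:8, 2:8] = True
pred_mask = np.zeros((10, 10), dtype=bool)
pred_mask[2:8, 3:9] = True

toy_dice = dice_coefficient(pred_mask, ref_mask)
toy_hd95 = hd95(pred_mask, ref_mask, spacing=(1.0, 1.0))
```

For real 3D CT contours, the pairwise distance matrix would normally be replaced by a distance-transform or KD-tree approach, and `spacing` would come from the image header so that the result is reported in millimetres.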