Boyd Aidan, Ye Zezhong, Prabhu Sanjay, Tjong Michael C, Zha Yining, Zapaishchykova Anna, Vajapeyam Sridhar, Hayat Hasaan, Chopra Rishi, Liu Kevin X, Nabavidazeh Ali, Resnick Adam, Mueller Sabine, Haas-Kogan Daphne, Aerts Hugo J W L, Poussaint Tina, Kann Benjamin H
Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA.
Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women's Hospital, Harvard Medical School, Boston, MA.
medRxiv. 2023 Sep 18:2023.06.29.23292048. doi: 10.1101/2023.06.29.23292048.
Artificial intelligence (AI)-automated tumor delineation for pediatric gliomas would enable real-time volumetric evaluation to support diagnosis, treatment response assessment, and clinical decision-making. Auto-segmentation algorithms for pediatric tumors remain rare because of limited data availability, and existing algorithms have yet to demonstrate clinical translation.
We leveraged two datasets from a national brain tumor consortium (n=184) and a pediatric cancer center (n=100) to develop, externally validate, and clinically benchmark deep learning neural networks for pediatric low-grade glioma (pLGG) segmentation using a novel in-domain, stepwise transfer learning approach. The best model [selected by Dice similarity coefficient (DSC)] was externally validated and subjected to randomized, blinded evaluation by three expert clinicians, who rated the clinical acceptability of expert- and AI-generated segmentations on 10-point Likert scales and via Turing tests.
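For reference, the DSC used throughout is the standard overlap metric between two segmentation masks A and B (treated as voxel sets):

DSC(A, B) = \frac{2\,|A \cap B|}{|A| + |B|},

which ranges from 0 (no overlap) to 1 (perfect agreement).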
The best AI model utilized in-domain, stepwise transfer learning (median DSC: 0.877 [IQR 0.715-0.914]) versus the baseline model (median DSC: 0.812 [IQR 0.559-0.888]; P<0.05). On external testing (n=60), the AI model yielded accuracy comparable to inter-expert agreement (median DSC: 0.834 [IQR 0.726-0.901] vs. 0.861 [IQR 0.795-0.905]; P=0.13). On clinical benchmarking (n=100 scans, 300 segmentations from 3 experts), the experts rated the AI model higher, on average, than the other experts (median Likert rating: 9 [IQR 7-9] vs. 7 [IQR 7-9]; P<0.05 for each). Additionally, the AI segmentations had significantly higher (P<0.05) overall acceptability than the expert segmentations on average (80.2% vs. 65.4%). Experts correctly identified the AI segmentations as machine-generated in an average of 26.0% of cases.
Stepwise transfer learning enabled expert-level pediatric brain tumor auto-segmentation and volumetric measurement with a high level of clinical acceptability. This approach may enable development and translation of AI imaging segmentation algorithms in limited-data scenarios.
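For illustration only, the following is a minimal sketch of what a two-stage, in-domain transfer-learning workflow for segmentation could look like in PyTorch. The model, loss, loader names, and hyperparameters are hypothetical placeholders and do not reflect the authors' actual architecture or training configuration; the sketch only shows the general idea of pretraining on a larger dataset and then fine-tuning the same weights on a smaller, in-domain one.

# Hypothetical sketch of stepwise, in-domain transfer learning for segmentation.
# Stage 1: train on a larger consortium-style dataset; Stage 2: fine-tune the same
# weights on a smaller institutional dataset. All names are illustrative placeholders.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Placeholder 3D segmentation network (stand-in for the real architecture)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 1, kernel_size=1),  # single-channel logit map
        )

    def forward(self, x):
        return self.net(x)

def soft_dice_loss(logits, target, eps=1e-6):
    """1 - soft Dice between predicted probabilities and a binary mask."""
    probs = torch.sigmoid(logits)
    inter = (probs * target).sum()
    return 1.0 - (2.0 * inter + eps) / (probs.sum() + target.sum() + eps)

def train_stage(model, loader, epochs, lr):
    """Generic training loop reused for both the pretraining and fine-tuning stages."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for img, mask in loader:  # img, mask: (B, 1, D, H, W) tensors
            opt.zero_grad()
            loss = soft_dice_loss(model(img), mask)
            loss.backward()
            opt.step()
    return model

# consortium_loader / institutional_loader would be DataLoaders of MRI volumes (assumed).
# model = TinySegNet()
# model = train_stage(model, consortium_loader, epochs=100, lr=1e-3)    # Stage 1: pretrain
# model = train_stage(model, institutional_loader, epochs=30, lr=1e-4)  # Stage 2: fine-tune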