Konkel Brandon, Macdonald Jacob, Lafata Kyle, Zaki Islam H, Bozdogan Erol, Chaudhry Mohammad, Wang Yuqi, Janas Gemini, Wiggins Walter F, Bashir Mustafa R
From the Department of Radiology (B.K., J.M., K.L., I.H.Z., E.B., M.C., G.J., W.F.W., M.R.B.), Department of Radiation Oncology (K.L.), and Department of Medicine, Division of Gastroenterology (M.R.B.), Duke University School of Medicine, Duke University Medical Center, Box 3808, Durham, NC 27710; Department of Electrical & Computer Engineering, Duke University Pratt School of Engineering, Durham, NC (K.L., Y.W.); Department of Radiology, Faculty of Medicine, Benha University, Benha, Egypt (I.H.Z.); Department of Radiology, College of Medicine-Tucson, University of Arizona, Tucson, AZ (E.B.); and Department of Radiology, Rutgers Health-Newark Beth Israel Medical Center, Newark, NJ (M.C.).
Radiol Artif Intell. 2023 Feb 22;5(3):e220080. doi: 10.1148/ryai.220080. eCollection 2023 May.
To investigate the effect of training data type on generalizability of deep learning liver segmentation models.
This Health Insurance Portability and Accountability Act-compliant retrospective study included 860 MRI and CT abdominal scans obtained between February 2013 and March 2018 and 210 volumes from public datasets. Five single-source models were trained on 100 scans each of T1-weighted fat-suppressed portal venous (dynportal), T1-weighted fat-suppressed precontrast (dynpre), proton density opposed-phase (opposed), single-shot fast spin-echo (ssfse), and T1-weighted non-fat-suppressed (t1nfs) sequence types. A sixth multisource (DeepAll) model was trained on 100 scans consisting of 20 randomly selected scans from each of the five source domains. All models were tested against 18 target domains comprising unseen vendors, unseen MRI types, and an unseen modality (CT). The Dice-Sørensen coefficient (DSC) was used to quantify similarity between manual and model segmentations.
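For reference, the DSC reported throughout is the standard overlap measure between two binary masks, 2|A∩B| / (|A| + |B|). A minimal NumPy sketch (the function name and toy masks are illustrative, not taken from the study):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice-Sørensen coefficient between two binary segmentation masks.

    Returns 1.0 for identical masks and 0.0 for disjoint masks.
    """
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: conventionally treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total

# Toy example: two overlapping 2D masks
a = np.zeros((4, 4), dtype=bool); a[:2, :2] = True   # 4 voxels
b = np.zeros((4, 4), dtype=bool); b[:2, :3] = True   # 6 voxels
print(round(dice_coefficient(a, b), 3))  # intersection = 4 → 2*4/(4+6) = 0.8
```

A DSC of 1.0 indicates perfect voxelwise agreement with the manual segmentation; values near 0 (e.g., the ssfse model's 0.089 on other MRI types) indicate almost no overlap.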
Single-source model performance did not degrade significantly against unseen vendor data. Models trained on T1-weighted dynamic data generally performed well on other T1-weighted dynamic data (DSC = 0.848 ± 0.183 [SD]). The opposed model generalized moderately well to all unseen MRI types (DSC = 0.703 ± 0.229). The ssfse model failed to generalize well to any other MRI type (DSC = 0.089 ± 0.153). Dynamic and opposed models generalized moderately well to CT data (DSC = 0.744 ± 0.206), whereas other single-source models performed poorly (DSC = 0.181 ± 0.192). The DeepAll model generalized well across vendor, modality, and MRI type and against externally sourced data.
Domain shift in liver segmentation appears to be tied to variations in soft-tissue contrast and can be effectively bridged by diversifying soft-tissue representation in the training data.

Keywords: Convolutional Neural Network (CNN), Deep Learning Algorithms, Machine Learning Algorithms, Supervised Learning, CT, MRI, Liver Segmentation

© RSNA, 2023.