School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, Guangdong, 510515, China.
Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, Guangdong, 510515, China.
Eur Radiol. 2023 Apr;33(4):2426-2438. doi: 10.1007/s00330-022-09229-w. Epub 2022 Nov 10.
To develop a deep learning-based harmonization framework, assessing whether it can improve performance of radiomics models given different kernels in different clinical tasks and additionally generalize to mitigate the effects of new/unobserved kernels on radiomics features.
Patient data with 2 reconstruction kernels and phantom data with 22 reconstruction kernels were included. Eighty-five patients were studied for lymph node metastasis (LNM) prediction, and 164 patients for differential diagnosis between lung cancer (LC) and pulmonary tuberculosis (TB). Two convolutional neural network (CNN) models were developed to convert images (i) from B70f to B30f (CNNa) and (ii) from B30f to B70f (CNNb). Model performance between the two kernels was evaluated using AUC and compared with that of other well-known harmonization methods. Patient-normalized feature difference (PNFD) was used to identify kernels incompatible with the baseline (B30f/B70f), i.e., kernels with median PNFD > 1, and to measure the ability of the CNN models to convert these non-comparable kernels.
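The abstract does not state the PNFD formula, so the following is only a minimal sketch under an assumed definition: for each feature, the absolute difference between its value under a test kernel and under the baseline kernel, divided by the spread of that feature across patients (here the standard deviation, which is an assumption). Only the rule that a kernel is incompatible when its median PNFD exceeds 1 comes from the text; the function names and toy numbers are hypothetical.

```python
from statistics import median, stdev

def pnfd(test_value, baseline_value, patient_values):
    """Assumed PNFD: |test - baseline| scaled by the inter-patient SD."""
    spread = stdev(patient_values)
    if spread == 0:
        raise ValueError("feature is constant across patients")
    return abs(test_value - baseline_value) / spread

def median_pnfd(test_features, baseline_features, patient_features):
    """Median of the per-feature PNFD values for one kernel."""
    return median(
        pnfd(test_features[name], baseline_features[name], patient_features[name])
        for name in baseline_features
    )

def is_incompatible(test_features, baseline_features, patient_features, threshold=1.0):
    """Apply the stated rule: incompatible if median PNFD > 1."""
    return median_pnfd(test_features, baseline_features, patient_features) > threshold

# Hypothetical toy data: three features, three patients.
patient_features = {"f1": [1.0, 2.0, 3.0], "f2": [10.0, 20.0, 30.0], "f3": [0.0, 1.0, 2.0]}
baseline = {"f1": 2.0, "f2": 20.0, "f3": 1.0}       # phantom, baseline kernel
unknown = {"f1": 4.0, "f2": 25.0, "f3": 2.5}        # phantom, unknown kernel
harmonized = {"f1": 2.5, "f2": 22.0, "f3": 1.25}    # same scan after conversion

before = median_pnfd(unknown, baseline, patient_features)
after = median_pnfd(harmonized, baseline, patient_features)
print(before, is_incompatible(unknown, baseline, patient_features))
print(after, is_incompatible(harmonized, baseline, patient_features))
```

In this toy case the unknown kernel is flagged before conversion and passes after it, which mirrors how the threshold is used in the study without reproducing its actual computation.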
For LC versus pulmonary TB diagnosis, AUCs of CNNa vs. others were 0.85 vs. 0.54-0.74 (p = 0.0001-0.0003), and for CNNb vs. others: 0.87 vs. 0.54-0.86 (p = 0.0001-0.55). For LNM prediction, AUCs of CNNa vs. others were 0.68 vs. 0.56-0.61 (p = 0.10-0.39), and for CNNb vs. others: 0.78 vs. 0.70-0.73 (p = 0.07-0.40). After CNN harmonization, 17 of the 20 (85%) investigated unknown kernels produced radiomics feature values comparable to baseline (median PNFD decreased from 1.10-2.31 to 0.23-1.13).
The CNN harmonization effectively improved performance of radiomics models between reconstruction kernels in different clinical tasks, and reduced feature differences between unknown kernels vs. baseline.
• The soft (B30f) and sharp (B70f) kernels strongly affect radiomics reproducibility and generalizability.
• The convolutional neural network (CNN) harmonization methods performed better than location-scale (ComBat and centering-scaling) and matrix factorization harmonization methods (based on singular value decomposition (SVD) and independent component analysis (ICA)) in both clinical tasks.
• The CNN harmonization methods improve feature reproducibility not only between specific kernels (B30f and B70f) from the same scanner, but also between unobserved kernels from different scanners of different vendors.
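Of the comparison methods named above, centering-scaling is the simplest to illustrate. The following is a minimal sketch, assuming it means standardizing each feature separately within each kernel group (zero mean, unit standard deviation) so that kernel-specific shifts in location and scale are removed. The exact variant used in the study is not given here, and the function name and numbers are hypothetical.

```python
from statistics import mean, stdev

def center_scale(values_by_kernel):
    """Standardize one feature within each kernel group (assumed variant)."""
    harmonized = {}
    for kernel, values in values_by_kernel.items():
        mu, sd = mean(values), stdev(values)
        if sd == 0:
            raise ValueError(f"feature is constant for kernel {kernel}")
        harmonized[kernel] = [(v - mu) / sd for v in values]
    return harmonized

# Hypothetical values of one feature for the same three patients,
# reconstructed with a soft and a sharp kernel.
feature_by_kernel = {
    "B30f": [1.0, 2.0, 3.0],
    "B70f": [12.0, 16.0, 20.0],
}

scaled = center_scale(feature_by_kernel)
print(scaled["B30f"])
print(scaled["B70f"])
```

Such a location-scale correction acts on extracted feature values per group, whereas the CNN models described above convert the images themselves before feature extraction.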