Foy Joseph J, Robinson Kayla R, Li Hui, Giger Maryellen L, Al-Hallaq Hania, Armato Samuel G
University of Chicago, Department of Radiology, Chicago, Illinois, United States.
University of Chicago, Department of Radiation and Cellular Oncology, Chicago, Illinois, United States.
J Med Imaging (Bellingham). 2018 Oct;5(4):044505. doi: 10.1117/1.JMI.5.4.044505. Epub 2018 Dec 4.
Given the growing need for consistent quantitative image analysis, variations in radiomics feature calculations due to differences in radiomics software were investigated. Two in-house radiomics packages and two freely available radiomics packages, MaZda and IBEX, were used. Forty regions of interest (ROIs) from 40 digital mammograms were studied, along with 39 manually delineated ROIs from the head and neck (HN) computed tomography (CT) scans of 39 patients. Each package was used to calculate first-order histogram features and second-order gray-level co-occurrence matrix (GLCM) features. Friedman tests determined whether feature values differed across packages, whereas intraclass correlation coefficients (ICCs) quantified agreement. All first-order features computed from both mammography and HN cases (except skewness in mammography) differed significantly across packages because of systematic biases introduced by each package. Based on ICC values, however, all but one first-order feature calculated on mammography ROIs and all but two first-order features calculated on HN CT ROIs showed excellent agreement, indicating that the observed differences were systematic but small relative to the feature values. All second-order features computed from the two databases differed significantly and showed poor agreement among packages, due largely to discrepancies in package-specific default GLCM parameters. Additional differences in radiomics features were traced to variations in image preprocessing, algorithm implementation, and naming conventions. The large variations in features among software packages indicate that greater efforts to standardize radiomics processes are needed.
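The combination of a significant Friedman test with a high ICC can look contradictory, so a minimal sketch on synthetic data may help. It is not the study's analysis: the feature values, the four package offsets, the choice of the two-way random-effects, absolute-agreement, single-measure form ICC(2,1), and the helper names `friedman_statistic` and `icc_2_1` are all assumptions made for illustration. The data are noise-free, with each hypothetical package adding a small constant offset to the same underlying values.

```python
# Synthetic illustration only: values, offsets, and the ICC form are assumptions,
# not data or methods taken from the study.

def friedman_statistic(table):
    """Friedman chi-square for an n x k table (rows = ROIs, columns = packages).

    Assumes no ties within a row, so no tie correction is applied.
    """
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        order = sorted(range(k), key=lambda j: row[j])  # column indices, ascending value
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def icc_2_1(table):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between ROIs
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between packages
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


# Ten hypothetical ROIs whose "true" feature value spans 100..190.
true_values = [100.0 + 10.0 * i for i in range(10)]
# Four hypothetical packages, each adding a small constant bias.
offsets = [0.0, 0.5, 1.0, 1.5]
table = [[t + b for b in offsets] for t in true_values]

friedman_q = friedman_statistic(table)
icc = icc_2_1(table)

# 7.815 is the chi-square critical value for 3 degrees of freedom at alpha = 0.05.
print(f"Friedman Q = {friedman_q:.1f} (critical value 7.815)")
print(f"ICC(2,1)   = {icc:.4f}")
```

Because every ROI ranks the four packages in the same order, the Friedman statistic reaches its maximum for this table size and far exceeds the critical value, while the ICC stays close to 1 because the offsets are tiny compared with the spread of values between ROIs. The chi-square comparison is itself an approximation for a table this small.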