Department of Radiology, New York University Langone Medical Center, 660 First Ave, New York, NY 10016.
Department of Radiology, University of Michigan Health Systems, Ann Arbor, MI.
AJR Am J Roentgenol. 2021 Nov;217(5):1132-1140. doi: 10.2214/AJR.21.25456. Epub 2021 Apr 14.
Multiple commercial and open-source software applications are available for texture analysis. Nonstandard techniques can cause undesirable variability that impedes result reproducibility and limits clinical utility. The purpose of this study is to measure agreement of texture metrics extracted by six software packages. This retrospective study included 40 renal cell carcinomas with contrast-enhanced CT from The Cancer Genome Atlas and Imaging Archive. Images were analyzed by seven readers at six sites. Each reader used one of six software packages to extract commonly studied texture features. Inter- and intrareader agreement for segmentation was assessed with intraclass correlation coefficients (ICCs). First-order (available in six packages) and second-order (available in three packages) texture features were compared between software pairs using Pearson correlation. Inter- and intrareader agreement was excellent (ICC, 0.93-1). First-order feature correlations were strong (r ≥ 0.8, p < .001) between 75% (21/28) of software pairs for mean intensity and SD, 48% (10/21) for entropy, 29% (8/28) for skewness, and 25% (7/28) for kurtosis. Of 15 second-order features, only cooccurrence matrix correlation, gray-level nonuniformity, and run-length nonuniformity showed strong correlation between software packages (r = 0.90-1, p < .001). Variability in first- and second-order texture features was common across software configurations and produced inconsistent results. Standardized algorithms and reporting methods are needed before texture data can be reliably used for clinical applications. It is important to be aware of variability related to texture software processing and configuration when reporting and comparing outputs.
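The pairwise comparison described above can be illustrated with a minimal sketch. It assumes each software configuration yields one value of a given texture feature per tumor, computes the Pearson correlation for every pair of configurations, and counts the pairs that reach the r ≥ 0.8 threshold. The configuration names, the helper functions `pearson_r` and `strong_pair_fraction`, and the sample values are hypothetical and are not taken from the study; the significance test (p < .001) is omitted for brevity.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def strong_pair_fraction(feature_by_config, threshold=0.8):
    """Count configuration pairs whose Pearson r meets the threshold.

    feature_by_config maps a (hypothetical) configuration name to the values
    of one texture feature, one value per tumor, in the same tumor order.
    Returns (strong_pairs, total_pairs). The p-value check is not performed.
    """
    pairs = list(combinations(sorted(feature_by_config), 2))
    strong = sum(
        1
        for a, b in pairs
        if pearson_r(feature_by_config[a], feature_by_config[b]) >= threshold
    )
    return strong, len(pairs)


# Made-up values for illustration only; not data from the study.
mean_intensity = {
    "config_A": [10.0, 20.0, 30.0, 40.0],
    "config_B": [11.0, 19.0, 32.0, 41.0],  # tracks config_A closely
    "config_C": [40.0, 10.0, 30.0, 20.0],  # ranks the tumors differently
}
strong, total = strong_pair_fraction(mean_intensity)
print(f"{strong}/{total} pairs with r >= 0.8")  # prints "1/3 pairs with r >= 0.8"
```

With n configurations there are n(n-1)/2 pairs, so the denominators of 28 and 21 reported above would correspond to eight and seven configurations, respectively; that reading is an inference from the arithmetic, not something the abstract states.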