Fusco Roberta, Piccirillo Adele, Sansone Mario, Granata Vincenza, Rubulotta Maria Rosaria, Petrosino Teresa, Barretta Maria Luisa, Vallone Paolo, Di Giacomo Raimondo, Esposito Emanuela, Di Bonito Maurizio, Petrillo Antonella
Radiology Division, Istituto Nazionale Tumori-IRCCS-Fondazione G. Pascale, 80131 Naples, Italy.
Department of Electrical Engineering and Information Technologies, Università degli Studi di Napoli Federico II, 80125 Naples, Italy.
Diagnostics (Basel). 2021 Apr 30;11(5):815. doi: 10.3390/diagnostics11050815.
The aim of the study was to estimate the diagnostic accuracy of textural features extracted from dual-energy contrast-enhanced mammography (CEM) images by carrying out univariate and multivariate statistical analyses, including artificial intelligence approaches. In total, 80 patients with known breast lesions were enrolled in this prospective study, in accordance with regulations issued by the local Institutional Review Board. All patients underwent dual-energy CEM examination in the craniocaudal (CC) projection and in a double acquisition (early and late) of the mediolateral oblique (MLO) projection. The reference standard was pathology from a surgical specimen for malignant lesions, and pathology from a surgical specimen, fine needle aspiration cytology, core or Tru-Cut needle biopsy, or vacuum-assisted breast biopsy for benign lesions. In total, 104 samples from 80 patients were analyzed. In addition, 48 textural parameters were extracted from manually segmented regions of interest. Univariate and multivariate analyses were performed using the non-parametric Wilcoxon-Mann-Whitney test, receiver operating characteristic (ROC) analysis, and the following classifiers: linear classifier (LDA), decision tree (DT), k-nearest neighbors (KNN), artificial neural network (NNET), and support vector machine (SVM). A balancing approach and feature selection methods were used. The univariate analysis showed low accuracy and low area under the curve (AUC) for all considered features. In the multivariate textural analysis, by contrast, the best performance for the CC view (accuracy (ACC) = 0.75; AUC = 0.82) was reached with a DT trained with leave-one-out cross-validation (LOOCV) on balanced data (using the adaptive synthetic (ADASYN) function) and a subset of three robust textural features (MAD, VARIANCE, and LRLGE). The best performance for the early-MLO view (ACC = 0.77; AUC = 0.83) was reached with an NNET trained with LOOCV on balanced data (using the ADASYN function) and a subset of ten robust features (MEAN, MAD, RANGE, IQR, VARIANCE, CORRELATION, RLV, COARSNESS, BUSYNESS, and STRENGTH). The best performance for the late-MLO view (ACC = 0.73; AUC = 0.82) was reached with an NNET trained with LOOCV on balanced data (using the ADASYN function) and a subset of eleven robust features (MODE, MEDIAN, RANGE, RLN, LRLGE, RLV, LZLGE, GLV_GLSZM, ZSV, COARSNESS, and BUSYNESS). Multivariate analyses using pattern recognition approaches that considered 144 textural features extracted from all three mammographic projections (CC, early MLO, and late MLO), optimized by adaptive synthetic sampling and feature selection, obtained the best results in the discrimination of benign and malignant lesions (ACC = 0.87; AUC = 0.90).
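As an illustration of how accuracy and AUC can be estimated under leave-one-out cross-validation, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes a k-nearest-neighbors classifier (one of the classifier families named above), made-up synthetic data with three features per sample, and an arbitrary 20/40 class split. All function names are hypothetical. ADASYN balancing and feature selection are deliberately left out. The AUC is computed through its equivalence with the normalized Mann-Whitney U statistic.

```python
import random


def auc_mann_whitney(pos, neg):
    """AUC as the probability that a positive scores above a negative.

    Ties count as one half, so this equals U / (n_pos * n_neg).
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def knn_score(train_X, train_y, x, k=5):
    """Fraction of the k nearest training samples labelled 1 (malignant)."""
    neighbours = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    return sum(label for _, label in neighbours[:k]) / k


def loocv_scores(X, y, k=5):
    """Score each sample with a model fitted on all the other samples."""
    return [
        knn_score(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k)
        for i in range(len(X))
    ]


# Synthetic stand-in data (an assumption, not study data):
# 20 "benign" (label 0) and 40 "malignant" (label 1) samples.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]
X += [[rng.gauss(3.0, 1.0) for _ in range(3)] for _ in range(40)]
y = [0] * 20 + [1] * 40

scores = loocv_scores(X, y)
# A sample is called malignant when at least half of its neighbours are.
acc = sum((s >= 0.5) == bool(label) for s, label in zip(scores, y)) / len(y)
auc = auc_mann_whitney(
    [s for s, label in zip(scores, y) if label == 1],
    [s for s, label in zip(scores, y) if label == 0],
)
print(f"LOOCV accuracy = {acc:.2f}, AUC = {auc:.2f}")
```

If oversampling were added to a loop like this one, a common precaution is to apply it only to the training portion of each fold, so that synthetic neighbours of the held-out sample do not leak into training. The abstract does not state how balancing and cross-validation were combined in the study.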