Georgiou Harris, Mavroforakis Michael, Dimitropoulos Nikos, Cavouras Dionisis, Theodoridis Sergios
University of Athens, Informatics Department, TYPA Buildings, University Campus, 15771 Athens, Greece.
Artif Intell Med. 2007 Sep;41(1):39-55. doi: 10.1016/j.artmed.2007.06.004. Epub 2007 Aug 21.
This article presents a comprehensive signal analysis approach to mammographic mass boundary morphology. The purpose of this study is to identify efficient sets of simple yet effective shape features, applied to the original and multi-scaled spectral representations of the boundary, for the characterization of the mammographic mass. These new methods of representing and processing the mass boundary in more than one domain introduce comprehensive spectral and multi-scale wavelet versions of the original boundary signals, and greatly improve the information content of the base data used for pattern classification. The evaluation is conducted against morphological and diagnostic characterization of the mass, using statistical methods, fractal dimension analysis and a wide range of classifier architectures.
This study consists of (a) the investigation of the original radial distance measurements under the complete spectrum of signal analysis, (b) the application of curve feature extractors of morphological characteristics and the evaluation of the discriminative power of each, by means of statistical significance analysis and dataset fractal dimension, and (c) the application of a wide range of classifier architectures to these morphological datasets, in order to compare the efficiency and effectiveness of all architectures for mammographic mass characterization. The radial distance signal was exploited using the discrete Fourier transform (DFT) and the discrete wavelet transform (DWT) as additional carrier signals. Seven uniresolution feature functions were applied over these carrier signals, producing multiple shape descriptors. Classification was conducted against mass shape type and clinical diagnosis, using a wide range of linear and non-linear classifiers, including linear discriminant analysis (LDA), least-squares minimum distance (LSMD), k-nearest neighbor (k-NN), radial basis function (RBF) and multi-layered perceptron (MLP) neural networks (NN), and support vector machines (SVM). Fractal analysis was employed as a dataset analysis tool in the feature selection phase. The discriminative power of the features produced by this composite analysis was then analyzed by means of multivariate analysis of variance (MANOVA) and tested against two distinct classification targets, namely (a) the morphological shape type of the mass and (b) the histologically verified clinical diagnosis for each mammogram.
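As an illustration of the carrier signals described above, the following is a minimal sketch, not the authors' implementation. It assumes the radial distance is measured from the boundary centroid, uses a naive DFT, and uses a single-level Haar transform as a stand-in for the DWT; the abstract does not state the wavelet family, the sampling scheme, or the seven feature functions. All function names and the synthetic boundary are hypothetical.

```python
import cmath
import math

def radial_distance_signal(boundary):
    """Distance from the boundary centroid to each (x, y) boundary point."""
    n = len(boundary)
    cx = sum(x for x, _ in boundary) / n
    cy = sum(y for _, y in boundary) / n
    return [math.hypot(x - cx, y - cy) for x, y in boundary]

def dft_magnitude(signal):
    """Magnitudes of DFT bins 0..N/2 of the mean-removed signal (naive O(N^2))."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    return [
        abs(sum(c * cmath.exp(-2j * math.pi * k * t / n)
                for t, c in enumerate(centered)))
        for k in range(n // 2 + 1)
    ]

def haar_dwt_level(signal):
    """One level of the Haar DWT: (approximation, detail) coefficient lists."""
    approx, detail = [], []
    # Consecutive samples are paired; a trailing odd sample is dropped.
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) / math.sqrt(2.0))
        detail.append((a - b) / math.sqrt(2.0))
    return approx, detail

# Hypothetical synthetic boundary: a circle of radius 10 with a 5-lobed ripple.
n_points = 64
angles = [2.0 * math.pi * i / n_points for i in range(n_points)]
radii = [10.0 + 1.5 * math.cos(5 * a) for a in angles]
boundary = [(r * math.cos(a), r * math.sin(a)) for r, a in zip(radii, angles)]

r_signal = radial_distance_signal(boundary)
spectrum = dft_magnitude(r_signal)
approx, detail = haar_dwt_level(r_signal)

# The ripple frequency shows up as the dominant spectral bin.
dominant_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print("dominant DFT bin:", dominant_bin)
```

In practice a library FFT and a dedicated wavelet package would replace the naive loops; they are written out here only to keep the sketch dependency-free.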
Statistical analysis and classification results showed that the features extracted from the DWT components, and especially from the DFT spectrum, have high discriminative value. Furthermore, much of the information content of the curve features in the DFT and DWT datasets is directly related to the texture and fine-scale details of the corresponding envelope signal of the spectral components. Neural classifiers outperformed all other methods for shape type identification, with an overall success rate of 72.3% (SVMs were not used for this task because they are mainly two-class classifiers), while SVM achieved the highest overall success rate, 91.54%, for clinical diagnosis. Receiver operating characteristic (ROC) analysis was employed to present the sensitivity and specificity of the results of this study.
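For readers unfamiliar with ROC analysis, the sketch below shows how sensitivity, specificity, and the area under the ROC curve can be computed for a two-class problem such as benign versus malignant. The labels and scores are invented for illustration and are not data from this study; the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_points(y_true, scores):
    """(false positive rate, true positive rate) at each distinct score threshold."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        sens, spec = sensitivity_specificity(y_true, y_pred)
        points.append((1.0 - spec, sens))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Invented example: 1 = malignant, 0 = benign, with classifier output scores.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

sens, spec = sensitivity_specificity(y_true, [1 if s >= 0.5 else 0 for s in scores])
curve = roc_points(y_true, scores)
auc = area_under_curve(curve)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Each threshold on the classifier score yields one sensitivity/specificity pair, and sweeping the threshold traces the full curve.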