Samala Ravi, Moreno Wilfrido, You Yuncheng, Qian Wei
College of Engineering, University of South Florida, Tampa, FL, USA.
Acad Radiol. 2009 Apr;16(4):418-27. doi: 10.1016/j.acra.2008.10.009.
This study analyzed the optimum selection of image features for representing lung nodules in the feature domain and implemented the selected features in the classification module of a computer-aided diagnosis system.
Forty-two regions of interest, obtained from 38 cases and with effective diameters of 3 to 8.5 mm, were used. Eleven features were computed on the basis of image characteristics and dimensionality. Nonparametric correlation coefficients, multiple regression analysis, and principal-component analysis were used to map the relation between the features described by four radiologists and the computed features. An artificial neural network classified the nodules as benign or malignant to test the hypothesis obtained from the mapping analysis.
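The correlation step can be illustrated with a small sketch. The abstract does not say which nonparametric coefficient was used, so Spearman's rank correlation is assumed here, and the ratings and feature values are hypothetical numbers chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one radiologist's ratings and one computed feature.
radiologist_rating = [1, 2, 3, 4, 5]
computed_feature = [0.10, 0.40, 0.35, 0.80, 0.90]
rho = spearman(radiologist_rating, computed_feature)
```

Because only the ordering of the values matters, the coefficient is unaffected by any monotonic rescaling of a computed feature, which is convenient when ordinal annotations are compared with continuous measurements.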
Correlation coefficients ranging from 0.2693 to 0.5178 were obtained between the radiologists' annotations and the computed features. Of the 11 features used, three were found to be redundant when both nodule and non-nodule cases were used, and five were found to be redundant when nodule or non-nodule cases were used. Combining the results of the correlation coefficients, regression analysis, principal-component analysis, and the artificial neural network led to the selection of optimum features that achieved F-test values of 0.821 and 0.643 for malignant and benign nodules, respectively.
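What it means for a feature to be redundant can be sketched with a simple pairwise correlation filter. This is a simpler stand-in and not the principal-component analysis used in the study; the feature names, values, and the 0.95 threshold are all hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def redundant_features(columns, threshold=0.95):
    """Flag features that are nearly collinear with an already kept feature.

    columns maps a feature name to its list of values, one per region of
    interest. Features are visited in insertion order, so the first member
    of a correlated group is kept and the later ones are flagged.
    """
    kept, redundant = [], []
    for name, values in columns.items():
        if any(abs(pearson(values, columns[k])) >= threshold for k in kept):
            redundant.append(name)
        else:
            kept.append(name)
    return redundant


# Hypothetical feature table with four regions of interest.
features = {
    "area": [1.0, 2.0, 3.0, 4.0],
    "diameter": [2.0, 4.0, 6.0, 8.1],  # almost a rescaled copy of "area"
    "contrast": [5.0, 1.0, 4.0, 2.0],
}
flagged = redundant_features(features)
```

A filter like this only examines features in pairs. Principal-component analysis, as used in the study, can also expose redundancy that arises from combinations of several features.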
This study demonstrates that for the optimum selection of features, each feature should be analyzed individually and collectively to evaluate the impact on the computer-aided diagnosis system on the basis of its class representation. This methodology will ultimately aid in improving the generalization capability of a classification module for early lung cancer diagnosis.