Sahiner B, Chan HP, Petrick N, Wagner RF, Hadjiiski L
Department of Radiology, University of Michigan, Ann Arbor, MI 48109-0904, USA.
Med Phys. 2000 Jul;27(7):1509-22. doi: 10.1118/1.599017.
In computer-aided diagnosis (CAD), a frequently used approach for distinguishing normal and abnormal cases is first to extract potentially useful features for the classification task. Effective features are then selected from this entire pool of available features. Finally, a classifier is designed using the selected features. In this study, we investigated the effect of finite sample size on classification accuracy when classifier design involves stepwise feature selection in linear discriminant analysis, which is the most commonly used feature selection algorithm for linear classifiers. The feature selection and the classifier coefficient estimation steps were considered to be cascading stages in the classifier design process. We compared the performance of the classifier when feature selection was performed on the design samples alone and on the entire set of available samples, which consisted of design and test samples. The area Az under the receiver operating characteristic curve was used as our performance measure. After linear classifier coefficient estimation using the design samples, we studied the hold-out and resubstitution performance estimates. The two classes were assumed to have multidimensional Gaussian distributions, with a large number of features available for feature selection. We investigated the dependence of feature selection performance on the covariance matrices and means for the two classes, and examined the effects of sample size, number of available features, and parameters of stepwise feature selection on classifier bias. Our results indicated that the resubstitution estimate was always optimistically biased, except in cases where the parameters of stepwise feature selection were chosen such that too few features were selected by the stepwise procedure. When feature selection was performed using only the design samples, the hold-out estimate was always pessimistically biased. When feature selection was performed using the entire finite sample space, the hold-out estimates could be pessimistically or optimistically biased, depending on the number of features available for selection, the number of available samples, and their statistical distribution. For our simulation conditions, these estimates were always pessimistically (conservatively) biased if the ratio of the total number of available samples per class to the number of available features was greater than five.
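The following is a minimal simulation sketch, not the authors' code, illustrating the design the abstract describes: two multivariate Gaussian classes, feature selection performed either on the design samples alone or on the pooled design-plus-test samples, linear discriminant coefficients estimated on the design samples only, and resubstitution versus hold-out Az estimates compared. The sample sizes, number of features, and mean shift are illustrative assumptions, and the greedy forward selection on training Az is a simplified stand-in for the paper's stepwise procedure (which is typically driven by F-to-enter/F-to-remove thresholds).

```python
# Sketch of the finite-sample bias experiment described in the abstract.
# All constants below are illustrative assumptions, not values from the paper.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

N_FEATURES = 30      # features available for selection (assumption)
N_USEFUL = 5         # features with a true class-mean difference (assumption)
N_PER_CLASS = 50     # design samples per class (assumption)
MAX_SELECTED = 5     # cap on selected features, loosely mimicking F-to-enter limits


def sample_class(n, shift):
    """Draw n samples from a unit-covariance Gaussian with the given mean shift."""
    mean = np.zeros(N_FEATURES)
    mean[:N_USEFUL] = shift
    return rng.normal(loc=mean, scale=1.0, size=(n, N_FEATURES))


def make_set(n_per_class, shift=0.5):
    X = np.vstack([sample_class(n_per_class, 0.0), sample_class(n_per_class, shift)])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y


def forward_select(X, y, max_features):
    """Greedy forward selection maximizing training Az (simplified stepwise)."""
    selected, remaining, best_auc = [], list(range(X.shape[1])), 0.0
    for _ in range(max_features):
        candidates = []
        for j in remaining:
            cols = selected + [j]
            lda = LinearDiscriminantAnalysis().fit(X[:, cols], y)
            auc = roc_auc_score(y, lda.decision_function(X[:, cols]))
            candidates.append((auc, j))
        auc, j = max(candidates)
        if auc <= best_auc:          # stop when no feature improves training Az
            break
        best_auc = auc
        selected.append(j)
        remaining.remove(j)
    return selected


# Design (training) and test samples.
X_design, y_design = make_set(N_PER_CLASS)
X_test, y_test = make_set(N_PER_CLASS)

for label, X_sel, y_sel in [
    ("selection on design samples only", X_design, y_design),
    ("selection on design + test samples",
     np.vstack([X_design, X_test]), np.concatenate([y_design, y_test])),
]:
    feats = forward_select(X_sel, y_sel, MAX_SELECTED)
    # Coefficients are always estimated on the design samples alone.
    lda = LinearDiscriminantAnalysis().fit(X_design[:, feats], y_design)
    az_resub = roc_auc_score(y_design, lda.decision_function(X_design[:, feats]))
    az_holdout = roc_auc_score(y_test, lda.decision_function(X_test[:, feats]))
    print(f"{label}: resubstitution Az = {az_resub:.3f}, hold-out Az = {az_holdout:.3f}")
```

Under the abstract's findings, repeated runs of such a simulation would be expected to show the resubstitution Az exceeding the hold-out Az (optimistic bias), with the hold-out estimate pessimistically biased when selection uses the design samples only, and potentially biased in either direction when selection uses the pooled sample.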