Cao Peng, Zhao Dazhe, Zaiane Osmar
Annu Int Conf IEEE Eng Med Biol Soc. 2013;2013:3981-4. doi: 10.1109/EMBC.2013.6610417.
Class imbalance arises when training a computer-aided detection (CAD) system for nodules, and it leads to poor prediction performance on true nodules. Moreover, the misclassification costs of the two classes differ, and high sensitivity to true nodules is essential in detection. To eliminate or reduce false positives while keeping sensitivity high, we present a wrapper framework that incorporates an evaluation measure for imbalanced data into the objective function of a cost-sensitive SVM. We improve classification performance by simultaneously optimizing the misclassification cost parameter, the feature subset, and the intrinsic parameters of the SVM. We evaluated the method on a 3D lung nodule dataset. The proposed method outperforms many other existing common methods, as well as methods designed specifically for imbalanced data learning, which indicates its effectiveness for classifying data with imbalanced classes and unequal misclassification costs.
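The wrapper idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices here are assumptions made only for illustration: the G-mean (geometric mean of sensitivity and specificity) stands in for the unnamed imbalanced-data evaluation measure, the SVM is linear and trained by plain subgradient descent, an exhaustive grid replaces whatever search strategy the paper uses, and the data are synthetic. All function names (`g_mean`, `train_cs_svm`, `wrapper_search`, `make_toy_data`) are hypothetical.

```python
import itertools
import math
import random


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (labels are +1 / -1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        return 0.0
    return math.sqrt((tp / pos) * (tn / neg))


def train_cs_svm(X, y, C, cost_ratio, epochs=100, lr=0.1):
    """Linear cost-sensitive SVM via full-batch subgradient descent.

    Minimizes 0.5*||w||^2 + (C/n) * sum_i c_i * max(0, 1 - y_i*(w.x_i + b)),
    with c_i = cost_ratio for positives (nodules) and 1 for negatives.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = list(w), 0.0  # the regularizer contributes w to the gradient
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin
                c = (C / n) * (cost_ratio if yi == 1 else 1.0)
                for j in range(d):
                    gw[j] -= c * yi * xi[j]
                gb -= c * yi
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(model, X):
    w, b = model
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1 for xi in X]


def select(X, feats):
    """Keep only the columns listed in feats."""
    return [[row[j] for j in feats] for row in X]


def wrapper_search(X_tr, y_tr, X_val, y_val, cost_ratios, Cs):
    """Jointly pick feature subset, cost ratio and C by validation G-mean."""
    d = len(X_tr[0])
    best_score, best_cfg = -1.0, None
    for k in range(1, d + 1):
        for feats in itertools.combinations(range(d), k):
            tr, val = select(X_tr, feats), select(X_val, feats)
            for ratio, C in itertools.product(cost_ratios, Cs):
                model = train_cs_svm(tr, y_tr, C, ratio)
                score = g_mean(y_val, predict(model, val))
                if score > best_score:
                    best_score = score
                    best_cfg = {"features": feats, "cost_ratio": ratio, "C": C}
    return best_score, best_cfg


def make_toy_data(n_pos, n_neg, seed):
    """Synthetic imbalanced data: features 0 and 1 informative, feature 2 noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n, mu in ((1, n_pos, 1.0), (-1, n_neg, -1.0)):
        for _ in range(n):
            X.append([rng.gauss(mu, 1.0), rng.gauss(mu, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
    return X, y


X_tr, y_tr = make_toy_data(10, 100, seed=0)
X_val, y_val = make_toy_data(10, 100, seed=1)
best_score, best_cfg = wrapper_search(
    X_tr, y_tr, X_val, y_val, cost_ratios=[1, 5, 10], Cs=[0.1, 1.0]
)
# Baseline: all features, equal costs, C = 1.0 (this point is inside the grid).
baseline = g_mean(y_val, predict(train_cs_svm(X_tr, y_tr, 1.0, 1), X_val))
print("best G-mean:", best_score, "config:", best_cfg, "baseline:", baseline)
```

Because the equal-cost, all-feature configuration is one point in the grid, the selected configuration can never score below that baseline on the validation split. In this sketch the same split is used for selection and for reporting, so an unbiased performance estimate would still require a separate held-out test set.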