Mazurowski Maciej A, Zurada Jacek M, Tourassi Georgia D
Department of Radiology, Carl E. Ravin Advanced Imaging Laboratories, Duke University Medical Center, Durham, North Carolina 27705, USA.
Med Phys. 2009 Jul;36(7):2976-84. doi: 10.1118/1.3132304.
Ensemble classifiers have been shown to be efficient in multiple applications. In this article, the authors explore the effectiveness of ensemble classifiers in a case-based computer-aided diagnosis (CAD) system for the detection of masses in mammograms. They evaluate two general ways of constructing subclassifiers by resampling the available development dataset: random division and random selection. They also discuss the problem of selecting the ensemble size and propose two adaptive incremental techniques that automatically select the size for the problem at hand. All the techniques are evaluated with respect to a previously proposed information-theoretic CAD system (IT-CAD). The experimental results show that the examined ensemble techniques provide a statistically significant improvement in performance (AUC = 0.905 ± 0.024) over the original IT-CAD system (AUC = 0.865 ± 0.029). Some of the techniques allow a notable reduction in the total number of examples stored in the case base (to 1.3% of the original size), which in turn lowers storage requirements and shortens the response time of the system. Among the methods examined in this article, the two proposed adaptive techniques are by far the most effective at reducing the case base. Finally, the authors discuss and offer guidance on choosing the ensemble parameters.
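The two resampling schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the reading used here is an assumption: random division is taken to mean partitioning the development set into disjoint subsets, and random selection is taken to mean drawing each subset independently, so that subsets may overlap. The function names, the subset counts and sizes, and the integer case IDs are all hypothetical and are not taken from the article.

```python
import random


def random_division(cases, n_subsets, rng):
    """Assumed reading of 'random division': shuffle the cases, then
    split them into n_subsets disjoint subsets of near-equal size."""
    shuffled = list(cases)
    rng.shuffle(shuffled)
    # Striding assigns every case to exactly one subset.
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def random_selection(cases, n_subsets, subset_size, rng):
    """Assumed reading of 'random selection': draw each subset
    independently, without replacement inside a subset, so the same
    case may appear in more than one subset."""
    pool = list(cases)
    return [rng.sample(pool, subset_size) for _ in range(n_subsets)]


rng = random.Random(0)        # fixed seed so the sketch is repeatable
cases = list(range(1000))     # hypothetical case IDs, not real data

division = random_division(cases, 10, rng)
selection = random_selection(cases, 10, 20, rng)

# Number of distinct cases the selection-based ensemble would store.
stored = len({case for subset in selection for case in subset})
```

Under this reading, each subset would serve as the case base of one subclassifier. With random selection the ensemble stores at most `n_subsets * subset_size` distinct cases, which is one way a case base could shrink to a small fraction of its original size.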