Gundreddy Rohith Reddy, Tan Maxine, Qiu Yuchen, Cheng Samuel, Liu Hong, Zheng Bin
School of Electrical and Computer Engineering, University of Oklahoma, Norman, Oklahoma 73019.
Med Phys. 2015 Jul;42(7):4241-9. doi: 10.1118/1.4922681.
To develop a new computer-aided diagnosis (CAD) scheme that uses a content-based image retrieval (CBIR) approach to classify malignant and benign breast lesions depicted on digital mammograms, and to assess the performance and reproducibility of the scheme.
An image dataset of 820 regions of interest (ROIs) was used, of which 431 depict malignant lesions and 389 depict benign lesions. After an image preprocessing step to define the lesion center, two image features were computed from each ROI. The first feature is the average pixel value of a mapped region generated using a watershed algorithm. The second feature is the average pixel value difference between an ROI's center region and the rest of the image. A two-step CBIR approach uses these two features sequentially to search for the ten most similar reference ROIs for each queried ROI. A similarity-based classification score was then computed to predict the likelihood that the queried ROI depicts a malignant lesion. To assess the reproducibility of the CAD scheme, we selected an independent testing dataset of 100 ROIs. For each ROI in the testing dataset, we added four randomly queried lesion center pixels and examined the variation of the classification scores.
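The retrieval and scoring steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two features are combined across the two steps or how the score is weighted. The shortlist size `n_candidates`, the absolute-difference distances, the inverse-distance weighting, and the function names `two_step_retrieve` and `similarity_score` are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def two_step_retrieve(query, refs, n_candidates=50, k=10):
    """Return indices of k reference ROIs found in two sequential steps.

    query: shape (2,), holding (feature 1, feature 2) of the queried ROI.
    refs:  shape (n_refs, 2), the same two features for each reference ROI.
    n_candidates is a hypothetical shortlist size, not given in the abstract.
    """
    # Step 1 (assumed): shortlist the references closest on feature 1.
    d1 = np.abs(refs[:, 0] - query[0])
    shortlist = np.argsort(d1, kind="stable")[:n_candidates]
    # Step 2 (assumed): within the shortlist, keep the k closest on feature 2.
    d2 = np.abs(refs[shortlist, 1] - query[1])
    return shortlist[np.argsort(d2, kind="stable")[:k]]

def similarity_score(query, refs, labels, retrieved, eps=1e-6):
    """Inverse-distance-weighted fraction of malignant ROIs among those retrieved.

    labels: 1.0 for malignant, 0.0 for benign. The weighting is an assumption.
    """
    d = np.linalg.norm(refs[retrieved] - query, axis=1)
    w = 1.0 / (d + eps)
    return float(np.sum(w * labels[retrieved]) / np.sum(w))

# Synthetic demonstration data (not from the study).
rng = np.random.default_rng(0)
refs = rng.normal(size=(200, 2))
labels = (rng.random(200) < 0.5).astype(float)
query = np.array([0.1, -0.2])

retrieved = two_step_retrieve(query, refs)
score = similarity_score(query, refs, labels, retrieved)
print(len(retrieved), round(score, 3))
```

In a leave-one-out setting, each ROI would be queried against the remaining ROIs as the reference set, so the queried ROI never retrieves itself.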
An area under the ROC curve (AUC) of 0.962 ± 0.006 was obtained when applying a leave-one-out validation method to the 820 ROIs. On the independent testing dataset, the initial AUC was 0.832 ± 0.040; using the median classification score of each ROI over five queried seeds increased the AUC to 0.878 ± 0.035.
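The median-over-seeds step and the AUC figure can be sketched as follows. This is an illustration only: the abstract does not state how the AUC or its uncertainty was estimated, so the sketch uses the standard rank-based (Mann-Whitney) estimate and reports no standard error. The function names and the synthetic scores are assumptions.

```python
import numpy as np

def auc_mann_whitney(scores, labels):
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Ties count as half. labels: truthy for malignant, falsy for benign.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

def median_over_seeds(seed_scores):
    """seed_scores: shape (n_rois, n_seeds), one score per queried center pixel."""
    return np.median(np.asarray(seed_scores, dtype=float), axis=1)

# Synthetic demonstration: 100 ROIs, 5 seeds each (values are not from the study).
rng = np.random.default_rng(1)
truth = rng.random(100) < 0.5
clean = np.where(truth, 0.65, 0.35)
seed_scores = np.clip(clean[:, None] + rng.normal(scale=0.25, size=(100, 5)), 0, 1)

auc_single = auc_mann_whitney(seed_scores[:, 0], truth)
auc_median = auc_mann_whitney(median_over_seeds(seed_scores), truth)
print(round(auc_single, 3), round(auc_median, 3))
```

Taking the median across seeds damps the effect of a single poorly placed center pixel, which is one plausible reading of why the reported AUC rose from 0.832 to 0.878.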
The authors demonstrated that (1) a simple and efficient CBIR scheme using two features related to lesion density distribution achieved high performance in classifying breast lesions without actual lesion segmentation, and (2) as with conventional CAD schemes that use global optimization approaches, improving reproducibility remains a challenge in developing CAD schemes that use a CBIR-based regional optimization approach.