Bilska-Wolak Anna O, Floyd Carey E, Lo Joseph Y, Baker Jay A
Duke Advanced Imaging Laboratories, Department of Radiology, Duke University Medical Center, DUMC 2623, Durham, NC 27710, USA.
Acad Radiol. 2005 Jun;12(6):671-80. doi: 10.1016/j.acra.2005.02.011.
The purpose of this study was to validate the performance of a previously developed computer aid for classifying breast masses on mammography, using a new, independent database of cases that were not used for algorithm development.
A computer aid (classifier) based on the likelihood ratio (LRb) was previously developed on a database of 670 mass cases. The 670 cases (245 malignant) from one medical institution were described using 16 features from the American College of Radiology Breast Imaging-Reporting and Data System lexicon and patient history findings. A separate database of 151 validation cases (43 malignant), previously unseen by the classifier, was collected. These new validation cases were evaluated by the classifier without retraining. Performance evaluation methods included receiver operating characteristic (ROC) analysis, round-robin, and leave-one-out bootstrap sampling.
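To make the idea of a likelihood-ratio classifier concrete, the following is a minimal sketch, not the authors' LRb implementation. It assumes categorical features, treats them as conditionally independent, and uses additive smoothing; all three are simplifying assumptions introduced here, and the function name `fit_likelihood_ratio` and the toy feature values are hypothetical.

```python
from collections import Counter
from math import prod


def fit_likelihood_ratio(cases, labels, smoothing=1.0):
    """Return a function mapping a case to P(x | malignant) / P(x | benign).

    Assumption (not from the abstract): features are categorical and are
    treated as conditionally independent given the class.
    """
    n_features = len(cases[0])
    # Per-class, per-feature counts of each observed feature value.
    counts = {y: [Counter() for _ in range(n_features)] for y in (0, 1)}
    totals = Counter(labels)
    values = [set() for _ in range(n_features)]
    for x, y in zip(cases, labels):
        for j, v in enumerate(x):
            counts[y][j][v] += 1
            values[j].add(v)

    def likelihood(x, y):
        # Smoothed product of per-feature value frequencies for class y.
        return prod(
            (counts[y][j][v] + smoothing)
            / (totals[y] + smoothing * len(values[j]))
            for j, v in enumerate(x)
        )

    def ratio(x):
        # Values above 1 favor malignancy; values below 1 favor benignity.
        return likelihood(x, 1) / likelihood(x, 0)

    return ratio


# Toy data: (mass shape, mass margin); label 1 = malignant, 0 = benign.
toy_cases = [
    ("irregular", "spiculated"),
    ("irregular", "spiculated"),
    ("round", "circumscribed"),
    ("round", "circumscribed"),
    ("oval", "circumscribed"),
]
toy_labels = [1, 1, 0, 0, 0]
toy_ratio = fit_likelihood_ratio(toy_cases, toy_labels)
```

Once fitted, the returned function is applied to unseen cases without refitting, which mirrors the "without retraining" evaluation described above.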
The performance of the classifier on the training data yielded an average ROC area of 0.90 +/- 0.02 and partial ROC area (0.90AUC) of 0.60 +/- 0.06. The exact nonparametric performance on the validation set of 151 cases yielded a ROC area of 0.88 and 0.90AUC of 0.57. Using a 100% sensitivity cutoff threshold established on the training data (100% negative predictive value), the classifier correctly identified 100% of the malignant masses in the validation test set, while potentially obviating 26% of the biopsies performed on benign masses.
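The evaluation steps reported above (a nonparametric ROC area, and a cutoff fixed on the training data at 100% sensitivity and then applied unchanged to the validation set) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and toy scores are hypothetical, higher scores are assumed to indicate malignancy, and the partial ROC area is not computed.

```python
def auc_nonparametric(scores, labels):
    """Mann-Whitney estimate of the ROC area (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_at_full_sensitivity(train_scores, train_labels):
    """Lowest malignant training score: calling every case at or above it
    positive gives 100% sensitivity on the training data."""
    return min(s for s, y in zip(train_scores, train_labels) if y == 1)


def evaluate_at_threshold(scores, labels, threshold):
    """Return (sensitivity, fraction of benign cases below the threshold)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    sensitivity = sum(s >= threshold for s in pos) / len(pos)
    benign_spared = sum(s < threshold for s in neg) / len(neg)
    return sensitivity, benign_spared


# Hypothetical toy scores, not data from the study.
train_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5]
train_labels = [1, 1, 1, 0, 0, 0]
valid_scores = [0.7, 0.45, 0.1, 0.35, 0.6]
valid_labels = [1, 1, 0, 0, 0]

cutoff = threshold_at_full_sensitivity(train_scores, train_labels)
train_auc = auc_nonparametric(train_scores, train_labels)
sensitivity, benign_spared = evaluate_at_threshold(valid_scores, valid_labels, cutoff)
```

For scale, the validation set contains 108 benign cases (151 minus 43), so the reported 26% corresponds to roughly 28 benign biopsies potentially avoided.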
The LRb classifier performed consistently on new data that was not used for classifier development. The LRb classifier shows promise as a potential aid in reducing the number of biopsies performed on benign masses.