Kim Sun Mi, Kim Yongdai, Jeong Kuhwan, Jeong Heeyeong, Kim Jiyoung
Department of Radiology, Seoul National University Bundang Hospital, Seoul National University, Seongnam, Korea.
Department of Statistics, Seoul National University, Seoul, Korea.
Ultrasonography. 2018 Jan;37(1):36-42. doi: 10.14366/usg.16045. Epub 2017 Apr 14.
The aim of this study was to compare the performance of image analysis for predicting breast cancer using two distinct regression models and to evaluate the usefulness of incorporating clinical and demographic data (CDD) into the image analysis in order to improve the diagnosis of breast cancer.
This study included 139 solid masses from 139 patients who underwent an ultrasonography-guided core biopsy and had available CDD between June 2009 and April 2010. Three breast radiologists retrospectively reviewed the 139 breast masses and described each lesion using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We applied and compared two regression methods, stepwise logistic (SL) regression and logistic least absolute shrinkage and selection operator (LASSO) regression, in which the BI-RADS descriptors and CDD were used as covariates. We evaluated the performance of these regression methods and of the agreement of the radiologists in terms of test misclassification error and test area under the curve (AUC).
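As an illustration of the logistic LASSO approach and the two evaluation metrics named above, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the data are synthetic, the 100/39 train/test split, the penalty value `lam`, the step size, and all function names are assumptions chosen for illustration, and the L1-penalized model is fitted by proximal gradient descent simply as one convenient way to do so.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_logistic_lasso(X, y, lam, lr=0.5, n_iter=500):
    # Minimize mean logistic loss + lam * ||w||_1 by proximal gradient descent.
    # The intercept b is not penalized.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            r = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += r
            for j in range(p):
                grad_w[j] += r * xi[j]
        b -= lr * grad_b / n
        w = [soft_threshold(w[j] - lr * grad_w[j] / n, lr * lam) for j in range(p)]
    return w, b

def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def misclassification_error(y, prob, threshold=0.5):
    # Fraction of cases whose thresholded prediction differs from the label.
    wrong = sum(1 for t, p in zip(y, prob) if (p >= threshold) != (t == 1))
    return wrong / len(y)

def auc(y, scores):
    # AUC as the probability that a positive case outscores a negative one
    # (ties count one half).
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in data (NOT the study data): 139 cases, 10 binary
# covariates playing the role of BI-RADS descriptors and CDD; only the
# first three covariates actually influence the outcome.
rng = random.Random(0)
X = [[rng.randint(0, 1) for _ in range(10)] for _ in range(139)]
y = [1 if rng.random() < sigmoid(-1.0 + 2.0 * x[0] + 1.5 * x[1] - 1.5 * x[2]) else 0
     for x in X]

# Hypothetical hold-out split; the split size is an assumption.
X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]

w, b = fit_logistic_lasso(X_train, y_train, lam=0.02)
prob = predict_proba(X_test, w, b)
test_error = misclassification_error(y_test, prob)
test_auc = auc(y_test, prob)
n_selected = sum(1 for v in w if v != 0.0)  # covariates kept by the L1 penalty
print(test_error, test_auc, n_selected)
```

The L1 penalty sets some coefficients exactly to zero, so variable selection and estimation happen in a single fit, whereas stepwise selection adds or removes covariates one at a time. In practice the penalty strength would typically be chosen by cross-validation rather than fixed as it is here.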
Logistic LASSO regression was superior (P<0.05) to SL regression, regardless of whether CDD was included in the covariates, in terms of test misclassification errors (0.234 vs. 0.253, without CDD; 0.196 vs. 0.258, with CDD) and AUC (0.785 vs. 0.759, without CDD; 0.873 vs. 0.735, with CDD). However, it was inferior (P<0.05) to the agreement of three radiologists in terms of test misclassification errors (0.234 vs. 0.168, without CDD; 0.196 vs. 0.088, with CDD) and the AUC without CDD (0.785 vs. 0.844, P<0.001), whereas its AUC with CDD was comparable (0.873 vs. 0.880, P=0.141).
Logistic LASSO regression based on BI-RADS descriptors and CDD showed better performance than SL regression in predicting the presence of breast cancer. The use of CDD as a supplement to the BI-RADS descriptors significantly improved the prediction of breast cancer using logistic LASSO regression.