Department of Obstetrics and Gynecology, Tenon APHP University Hospital, 75020 Paris, France.
Eur J Cancer. 2012 Jan;48(1):30-6. doi: 10.1016/j.ejca.2011.08.011. Epub 2011 Nov 17.
Ko's scoring system was developed to predict upgrade to malignancy in patients diagnosed with atypical ductal hyperplasia by core needle biopsy. The Ko algorithm was able to identify a subset of patients who were eligible for clinical follow-up alone. The current study evaluated the statistical performance of this scoring system to determine whether it could be used safely in clinical practice.
We tested the statistical performance of the Ko scoring system in an external, independent multicentre population. One hundred and seven cases of atypical ductal hyperplasia diagnosed with an 11-gauge biopsy needle were available for inclusion in this study. The discrimination, calibration and clinical utility of the scoring system were quantified. In addition, we assessed the underestimation rate, sensitivity, specificity, and positive and negative predictive values according to the score threshold.
The overall underestimation rate was 19% (20/107). The area under the receiver operating characteristic curve for the logistic regression model was 0.51 (95% confidence interval: 0.47-0.53). The model was not well calibrated. The lowest predicted underestimation rate was 11%. At the most accurate threshold proposed in the original study, the sensitivity, specificity, positive predictive value, and negative predictive value were 90%, 22%, 20%, and 89%, respectively.
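The threshold-based measures reported above are all derived from a 2x2 table of score result against final histology. The following is a minimal sketch of those definitions, not the authors' analysis code. The function names are hypothetical, and the 2x2 counts in the usage lines are illustrative values chosen only to total 107 cases with 20 upgrades; they are not taken from the study and do not exactly reproduce its reported figures. Only the 20/107 underestimation rate comes from the abstract.

```python
def underestimation_rate(upgrades: int, total: int) -> float:
    """Fraction of biopsy-diagnosed cases upgraded to malignancy at surgery."""
    return upgrades / total


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 measures for one score threshold.

    tp: upgraded cases flagged by the score
    fn: upgraded cases missed by the score
    fp: non-upgraded cases flagged by the score
    tn: non-upgraded cases the score classed as low risk
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Reported in the abstract: 20 upgrades among 107 cases.
rate = underestimation_rate(20, 107)
print(f"underestimation rate: {rate:.1%}")

# ASSUMPTION: illustrative counts only (18 + 68 + 2 + 19 = 107), not study data.
for name, value in diagnostic_metrics(tp=18, fp=68, fn=2, tn=19).items():
    print(f"{name}: {value:.1%}")
```

With a high sensitivity and a low specificity, as in this pattern, most patients are flagged as high risk, so few can be assigned to follow-up alone.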
The scoring system was not sufficiently accurate to safely define a subset of patients who would be eligible for follow-up alone, without additional treatment. These results demonstrate a lack of reproducibility in an external population. A multidisciplinary approach that correlates clinicopathological and mammographic features should be recommended for the management of these patients.