Department of Infection, Immunity and Cardiovascular Disease, The University of Sheffield, Medical School, Beech Hill Road, Sheffield S10 2RX, U.K.
Department of Gynaecological Oncology, Royal Hallamshire Hospital, Glossop Road, Sheffield S10 2JF.
Eur J Obstet Gynecol Reprod Biol. 2019 Sep;240:182-186. doi: 10.1016/j.ejogrb.2019.07.003. Epub 2019 Jul 3.
To review the published diagnostic accuracy figures for the performance of colposcopy and to assess how the various forms of bias might explain the very wide range of reported values and the impact they have on quality assurance of cervical screening.
Publications were only selected where they contained sufficient raw data to enable diagnostic accuracy statistics to be calculated for the detection of cervical intraepithelial neoplasia grade 2+ (CIN2+), as determined by punch biopsy. In addition, both the colposcopic impression at the time of examination and the disease threshold used to determine the need for biopsy must have been reported.
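For reference, the diagnostic accuracy statistics named above follow the standard 2x2 definitions. The symbols TP, FP, FN and TN (true/false positives and negatives, with punch biopsy showing CIN2+ as the reference standard) are notation introduced here for illustration and do not appear in the original abstract.

```latex
\begin{align}
\text{Sensitivity} &= \frac{TP}{TP + FN}, \\
\text{Specificity} &= \frac{TN}{TN + FP}, \\
\text{PPV}         &= \frac{TP}{TP + FP}.
\end{align}
```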
Large differences in diagnostic accuracy figures were found depending on how the output of colposcopy was defined: either on the basis that the colposcopist thought CIN2+ was present (first method), or on the basis that the colposcopist considered some disease to be present and so took a biopsy to confirm it (second method). Weighted mean sensitivity was 68.5% (95% CI 59.9-77.1) for the first method but 95.7% (95% CI 93.4-98.0) for the second. Weighted mean specificity was 75.9% (95% CI 69.3-82.5) for the first method but 34.2% (95% CI 27.0-41.4) for the second. Weighted mean positive predictive value (PPV) was 68.9% (95% CI 64.2-73.6) for the first method but 54.3% (95% CI 46.5-62.1) for the second.
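The direction of these differences can be reproduced with a small sketch. The counts below are entirely hypothetical (they are not data from the review), and the three-level impression scale, the `Case` type and the `accuracy` helper are assumptions made for illustration only. The same cases are scored twice: once counting a test as positive only when the impression is high grade (first method), and once counting any abnormal impression, that is, any case where a biopsy would be taken, as positive (second method).

```python
from typing import NamedTuple


class Case(NamedTuple):
    impression: str   # hypothetical scale: "normal", "low_grade", "high_grade"
    cin2_plus: bool   # reference standard: histology shows CIN2+


# Ordering of the hypothetical impression scale.
GRADE = {"normal": 0, "low_grade": 1, "high_grade": 2}


def accuracy(cases, threshold):
    """Sensitivity, specificity and PPV when 'test positive' means
    impression >= threshold. Assumes every denominator is non-zero."""
    tp = fp = fn = tn = 0
    for c in cases:
        positive = GRADE[c.impression] >= GRADE[threshold]
        if positive and c.cin2_plus:
            tp += 1
        elif positive:
            fp += 1
        elif c.cin2_plus:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Hypothetical cohort: 10 women with CIN2+ and 10 without.
CASES = (
    [Case("high_grade", True)] * 7 + [Case("high_grade", False)] * 3
    + [Case("low_grade", True)] * 2 + [Case("low_grade", False)] * 4
    + [Case("normal", True)] * 1 + [Case("normal", False)] * 3
)

first_method = accuracy(CASES, "high_grade")   # impression of CIN2+
second_method = accuracy(CASES, "low_grade")   # any disease, biopsy taken

print(first_method)
print(second_method)
```

Lowering the positivity threshold moves cases from the negative to the positive column, so sensitivity rises while specificity and PPV fall, which is the same pattern as in the weighted means reported above.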
The main reason for the wide range of published diagnostic accuracy figures is the use of two different methods of assessing the output of colposcopy. Colposcopic Impression is appropriate when assessing the performance of a colposcopist at the time of examination, but the taking of a biopsy to confirm that Disease is Present should be used when assessing patient management. Accurate assessment of both outcomes is fundamental to any quality assurance programme.