Valderrabano Pablo, Hallanger-Johnson Julie E, Thapa Ram, Wang Xuefeng, McIver Bryan
Department of Head and Neck-Endocrine Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida.
Department of Endocrinology and Nutrition, Hospital Universitario Ramón y Cajal, Madrid, Spain.
JAMA Otolaryngol Head Neck Surg. 2019 Sep 1;145(9):783-792. doi: 10.1001/jamaoto.2019.1449.
In the United States, the most widely used molecular test for evaluating cytologically indeterminate thyroid nodules is the Afirma gene expression classifier (GEC).
The objective was to evaluate the GEC's diagnostic performance using a novel approach that assesses whether the findings of the initial validation study are consistent with the results of postmarketing studies.
PubMed was systematically searched from inception through October 26, 2017, using the terms "gene expression classifier" or "Afirma" or "GEC" and "thyroid".
Studies were included if the GEC diagnostic performance could be calculated on consecutively resected, cytologically indeterminate thyroid nodules.
Two observers independently assessed study eligibility and risk of bias using the quality assessment tool for observational cohort and cross-sectional studies of the National Heart, Lung, and Blood Institute. Summary data were extracted by a reviewer and reviewed independently by another. Study authors were contacted if missing data were needed. Data were pooled using a random-effects model. PRISMA and MOOSE guidelines were followed.
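The abstract states only that a random-effects model was used for pooling. As a minimal sketch, assuming the common DerSimonian-Laird estimator (the abstract does not say which estimator was applied), per-study effects such as log diagnostic odds ratios could be pooled as follows. The function name and its inputs are hypothetical and serve only as an illustration.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    effects   : per-study effect estimates (e.g. log diagnostic odds ratios)
    variances : within-study variances of those estimates
    Returns (pooled_effect, standard_error, tau_squared).
    Assumption: DerSimonian-Laird is used here for illustration only.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2
```

If the effects are log odds ratios, exponentiating `pooled` and `pooled ± 1.96 * se` gives a pooled odds ratio with an approximate 95% CI.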
The main outcome was the linear correlation between the benign call rate (BCR) and the positive predictive value (PPV).
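A linear correlation between per-study BCR and PPV values can be summarized by a Pearson coefficient and a least-squares line. The sketch below is one plain way to compute both; it is not taken from the study, and the function name is hypothetical.

```python
import math


def pearson_and_line(x, y):
    """Return (r, slope, intercept) for paired observations x and y.

    r is the Pearson correlation coefficient; slope and intercept describe
    the ordinary least-squares line y = intercept + slope * x.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")

    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept
```

In this setting `x` would hold the BCR of each study and `y` the corresponding PPV; the fitted line can then be evaluated at a given BCR to read off the expected PPV, or the reverse.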
Of the 137 retrieved titles, 19 (13.9%) were included, comprising a total of 2568 thyroid nodules. In a simulation using the sensitivity and specificity reported in the initial validation study, the BCR and PPV values observed in postmarketing studies could be explained only by two different underlying prevalence rates of cancer (15% vs 30%), which is impossible. Furthermore, the overall correlation between BCR and PPV for independent studies fell outside the PPV 95% CI of the initial validation study (95% CI, 0.17-0.32) at the BCR of pooled independent studies (0.45) and was just at the limit of the BCR 95% CI of the initial validation study (95% CI, 0.32-0.45) at the PPV of pooled independent studies (0.45). The diagnostic performance was statistically significantly better for atypia or follicular lesions of undetermined significance (diagnostic odds ratio [DOR], 5.67; 95% CI, 4.23-7.60) than for follicular neoplasms (DOR, 2.24; 95% CI, 1.45-3.47).
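The prevalence argument follows from standard identities: for a fixed sensitivity and specificity, both BCR and PPV are determined by cancer prevalence, so each observed value implies a prevalence. The sketch below inverts both relationships. The sensitivity (0.92) and specificity (0.52) are assumed here for illustration, since the abstract does not state them, and the function names are hypothetical.

```python
def bcr(prev, sens, spec):
    """Benign call rate: share of all nodules the test calls benign."""
    return prev * (1 - sens) + (1 - prev) * spec


def ppv(prev, sens, spec):
    """Positive predictive value: share of suspicious calls that are cancer."""
    true_pos = prev * sens
    false_pos = (1 - prev) * (1 - spec)
    return true_pos / (true_pos + false_pos)


def prevalence_from_bcr(bcr_obs, sens, spec):
    """Prevalence implied by an observed BCR (inverse of bcr)."""
    return (spec - bcr_obs) / (sens + spec - 1)


def prevalence_from_ppv(ppv_obs, sens, spec):
    """Prevalence implied by an observed PPV (inverse of ppv)."""
    num = ppv_obs * (1 - spec)
    return num / (sens * (1 - ppv_obs) + num)


# Assumed test characteristics (not stated in the abstract).
SENS, SPEC = 0.92, 0.52

# Pooled values reported in the abstract for independent studies.
implied_by_bcr = prevalence_from_bcr(0.45, SENS, SPEC)
implied_by_ppv = prevalence_from_ppv(0.45, SENS, SPEC)
print(f"prevalence implied by BCR: {implied_by_bcr:.2f}")
print(f"prevalence implied by PPV: {implied_by_ppv:.2f}")
```

Under these assumed inputs the two implied prevalences come out near 0.16 and 0.30, close to the 15% vs 30% contrast described above. A single cohort cannot have both prevalences at once, which is the inconsistency the simulation exposes.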
The findings suggest that the initial validation study cohort was not representative of the populations in whom the GEC has been used, calling into question its reported diagnostic performance, including its negative predictive value.