Institute for Technology Assessment, Massachusetts General Hospital, Boston, Massachusetts, USA.
Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts, USA.
Thyroid. 2022 Oct;32(10):1144-1157. doi: 10.1089/thy.2022.0269. Epub 2022 Sep 26.
Molecular tests for thyroid nodules with indeterminate fine needle aspiration results are increasingly used in clinical practice; however, the true diagnostic performance of these tests is unknown. A systematic review and meta-analysis were completed to (1) evaluate the accuracy of commercially available molecular tests for malignancy in indeterminate thyroid nodules and (2) quantify biases and limitations in studies that validate those tests. PubMed, EMBASE, and Web of Science were systematically searched through July 2021. English language articles that reported original clinical validation attempts of molecular tests for indeterminate thyroid nodules were included if they reported counts of true-negative, true-positive, false-negative, and false-positive results. We performed screening and full-text review, followed by assessment of eight common biases and limitations, extraction of diagnostic and histopathological information, and meta-analysis of clinical validity using a bivariate linear mixed-effects model. Forty-nine studies were included. Meta-analysis of the Afirma Gene Expression Classifier (GEC; n = 38 studies) revealed a sensitivity of 0.92 (confidence interval: 0.90-0.94), specificity of 0.26 (0.20-0.32), negative likelihood ratio (LR-) of 0.32 (0.23-0.44), positive likelihood ratio (LR+) of 1.24 (1.15-1.35), and area under the curve (AUC) of 0.83 (0.74-0.89). The Afirma Genomic Sequencing Classifier (GSC; n = 10) had a sensitivity of 0.94 (0.89-0.96), specificity of 0.38 (0.27-0.50), LR- of 0.18 (0.10-0.30), LR+ of 1.52 (1.28-1.87), and AUC of 0.91 (0.62-0.92). ThyroSeq v1 and v2 (n = 10) had a sensitivity of 0.86 (0.82-0.90), specificity of 0.74 (0.59-0.85), LR- of 0.19 (0.13-0.26), LR+ of 3.52 (2.08-5.92), and AUC of 0.86 (0.81-0.90). ThyroSeq v3 (n = 6) had a sensitivity of 0.92 (0.86-0.95), specificity of 0.41 (0.18-0.69), LR- of 0.24 (0.09-0.62), LR+ of 1.67 (1.09-2.98), and AUC of 0.90 (0.63-0.92).
Fourteen percent of studies conducted a blinded histopathologic review of excised thyroid nodules, and in 8% the decision to proceed to surgery was made blinded to molecular test results. The meta-analyses indicate high diagnostic accuracy of molecular tests for assessing the malignancy risk of thyroid nodules; however, the underlying studies are subject to several limitations. These limitations and their potential clinical impacts must be addressed and, when feasible, adjusted for using valid statistical methodologies.
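For readers less familiar with the reported measures, the following is a minimal sketch of how sensitivity, specificity, and the likelihood ratios relate to the four counts that included studies had to report. The function names and the example counts are hypothetical and are not taken from the review. Note also that the pooled LR+ and LR- in the abstract come from the bivariate model, so they need not equal the simple ratios computed from the pooled sensitivity and specificity point estimates.

```python
def accuracy_from_counts(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from a 2x2 diagnostic table.

    tp, fp, fn, tn are counts of true-positive, false-positive,
    false-negative, and true-negative results.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) using the standard definitions.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not (0 < specificity < 1):
        raise ValueError("specificity must be strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the review).
    sens, spec = accuracy_from_counts(tp=90, fp=60, fn=10, tn=40)
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
    print(f"LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
```

Applied to the pooled Afirma GEC point estimates (sensitivity 0.92, specificity 0.26), `likelihood_ratios` gives an LR+ of about 1.24, matching the reported value, and an LR- of about 0.31, close to but not identical with the reported 0.32, which illustrates the caveat above.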