Anja van't Hoog, Health Research & Training Consultancy, Utrecht, Netherlands.
Department of Global Public Health, Karolinska Institutet, Stockholm, Sweden.
Cochrane Database Syst Rev. 2022 Mar 23;3(3):CD010890. doi: 10.1002/14651858.CD010890.pub2.
BACKGROUND: Systematic screening in high-burden settings is recommended as a strategy for early detection of pulmonary tuberculosis disease, reducing mortality, morbidity and transmission, and improving equity in access to care. Questioning for symptoms and chest radiography (CXR) have historically been the most widely available tools to screen for tuberculosis disease. Their accuracy is important for the design of tuberculosis screening programmes and determines, in combination with the accuracy of confirmatory diagnostic tests, the yield of a screening programme and the burden on individuals and the health service.
OBJECTIVES: To assess the sensitivity and specificity of questioning for the presence of one or more tuberculosis symptoms or symptom combinations, CXR, and combinations of these as screening tools for detecting bacteriologically confirmed pulmonary tuberculosis disease in HIV-negative adults and adults with unknown HIV status who are considered eligible for systematic screening for tuberculosis disease. The secondary objective was to investigate sources of heterogeneity, especially in relation to regional, epidemiological, and demographic characteristics of the study populations.
SEARCH METHODS: We searched the MEDLINE, Embase, LILACS, and HTA (Health Technology Assessment) databases using pre-specified search terms, and consulted experts for unpublished reports, for the period 1992 to 2018. The search date was 10 December 2018. We repeated the search on 2 July 2021.
SELECTION CRITERIA: Studies were eligible if participants were screened for tuberculosis disease using symptom questions, or abnormalities on CXR, or both, and were offered confirmatory testing with a reference standard. We included studies if diagnostic two-by-two tables could be generated for one or more index tests, even if not all participants were subjected to a bacteriological reference standard. We excluded studies evaluating self-reporting of symptoms.
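As an illustration of what a diagnostic two-by-two table yields, the following minimal sketch computes sensitivity and specificity for a single study, with Wilson score intervals. The counts are hypothetical and are not taken from any included study; the function names and the choice of the Wilson interval are assumptions made for this example, not the review's method.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def accuracy_from_two_by_two(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against the reference standard.

    tp/fn: reference-positive participants who screened positive/negative.
    fp/tn: reference-negative participants who screened positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for one prevalence survey (illustration only).
result = accuracy_from_two_by_two(tp=42, fp=560, fn=58, tn=9340)
print(result)
```

Note that if only screen-positive participants receive the reference standard, the fn and tn cells are not observed directly, which is one way verification bias can enter such tables.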
DATA COLLECTION AND ANALYSIS: We categorized symptom and CXR index tests according to commonly used definitions. We assessed the methodological quality of included studies using the QUADAS-2 instrument. We examined the forest plots and receiver operating characteristic plots visually for heterogeneity. We estimated summary sensitivities and specificities (and 95% confidence intervals (CI)) for each index test using bivariate random-effects methods. We analyzed potential sources of heterogeneity in a hierarchical mixed model.
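The bivariate random-effects model models sensitivity and specificity jointly across studies and is normally fitted with dedicated statistical software. The sketch below is deliberately simpler and is not the review's method: it pools one proportion at a time (here, sensitivity) on the logit scale with a univariate DerSimonian-Laird random-effects estimator, ignoring the correlation between sensitivity and specificity. The study counts are hypothetical, and the 0.5 continuity correction is an assumption of this example.

```python
from math import exp, log, sqrt


def inv_logit(x):
    """Map a logit back to a proportion."""
    return 1 / (1 + exp(-x))


def pool_logit_random_effects(events, totals, z=1.96):
    """Univariate DerSimonian-Laird pooling of proportions on the logit scale.

    Returns (pooled proportion, lower CI, upper CI, tau squared).
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # A 0.5 correction in both cells keeps the logit finite if e == 0 or e == n.
        pos, neg = e + 0.5, n - e + 0.5
        logits.append(log(pos / neg))
        variances.append(1 / pos + 1 / neg)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # Method-of-moments estimate of between-study variance, truncated at zero.
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = sqrt(1 / sum(w_re))
    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), tau2


# Hypothetical data: true positives and reference-positive totals in four studies.
true_positives = [30, 45, 22, 60]
reference_positives = [80, 90, 60, 120]
pooled, lower, upper, tau2 = pool_logit_random_effects(true_positives, reference_positives)
print(f"pooled sensitivity {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f}), tau^2 {tau2:.3f}")
```

Pooling on the logit scale keeps the back-transformed estimate and its interval inside 0 to 1, which is one reason the logit transformation is common in diagnostic accuracy meta-analysis.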
MAIN RESULTS: The electronic database search identified 9473 titles and abstracts. Through expert consultation, we identified 31 reports on national tuberculosis prevalence surveys as eligible (of which eight were already captured in the search of the electronic databases), and we identified 957 potentially relevant articles through reference checking. After removal of duplicates, we assessed 10,415 titles and abstracts, of which we identified 430 (4%) for full-text review, after which we excluded 364 articles. In total, 66 articles provided data on 59 studies. We assessed the 2 July 2021 search results; seven studies were potentially eligible but would make no material difference to the review findings or grading of the evidence, and were not added in this edition of the review. We judged most studies to be at high risk of bias in one or more domains, most commonly because of incorporation bias and verification bias. We judged applicability concerns to be low in more than 80% of studies in all three domains. The three most common symptom index tests were cough for two or more weeks (41 studies), any cough (21 studies), and any tuberculosis symptom (29 studies). Their summary sensitivities were 42.1% (95% CI 36.6% to 47.7%), 51.3% (95% CI 42.8% to 59.7%), and 70.6% (95% CI 61.7% to 78.2%), respectively (all very low-certainty evidence). Their summary specificities were 94.4% (95% CI 92.6% to 95.8%, high-certainty evidence), 87.6% (95% CI 81.6% to 91.8%, low-certainty evidence), and 65.1% (95% CI 53.3% to 75.4%, low-certainty evidence), respectively. The data on symptom index tests were more heterogeneous than those for CXR. The studies on any tuberculosis symptom were the most heterogeneous, but had the fewest variables explaining this variation. Symptom index tests also showed regional variation.
Summary sensitivity was 94.7% (95% CI 92.2% to 96.4%, very low-certainty evidence) for any CXR abnormality (23 studies) and 84.8% (95% CI 76.7% to 90.4%, low-certainty evidence) for CXR abnormalities suggestive of tuberculosis (19 studies); the corresponding summary specificities were 89.1% (95% CI 85.6% to 91.8%, low-certainty evidence) and 95.6% (95% CI 92.6% to 97.4%, high-certainty evidence). Sensitivity was more heterogeneous than specificity, and this heterogeneity could be explained by regional variation. The addition of cough for two or more weeks, whether to any (pulmonary) CXR abnormality or to CXR abnormalities suggestive of tuberculosis, resulted in a summary sensitivity and specificity of 99.2% (95% CI 96.8% to 99.8%) and 84.9% (95% CI 81.2% to 88.1%) (15 studies; certainty of evidence not assessed).
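To show why adding a symptom question in parallel raises sensitivity and lowers specificity, the sketch below applies the textbook rule for two tests where a positive result on either counts as screen-positive. It assumes the two tests are conditionally independent given disease status, which is an assumption of this example and not something the review states. The inputs are the separately pooled summary estimates for any CXR abnormality and for cough of two or more weeks, so the output is not expected to reproduce the review's directly pooled estimate for the combination.

```python
def parallel_combination(sens_a, spec_a, sens_b, spec_b):
    """Accuracy of 'positive if either test is positive'.

    Assumes conditional independence of the two tests given disease status.
    """
    # A case is missed only if both tests miss it.
    sensitivity = 1 - (1 - sens_a) * (1 - sens_b)
    # A non-case is correctly ruled out only if both tests are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


# Summary estimates from the review, used here only as illustrative inputs.
cxr_any = (0.947, 0.891)        # any CXR abnormality
cough_2_weeks = (0.421, 0.944)  # cough for two or more weeks

sens, spec = parallel_combination(*cxr_any, *cough_2_weeks)
print(f"parallel sensitivity {sens:.3f}, specificity {spec:.3f}")
```

In practice, symptoms and CXR findings are likely to be correlated among people with tuberculosis, so the independence rule should be read as a rough guide to the direction of the trade-off rather than a prediction.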
AUTHORS' CONCLUSIONS: The summary estimates of the symptom and CXR index tests may inform the choice of screening and diagnostic algorithms in any given setting or country where screening for tuberculosis is being implemented. The high sensitivity of CXR index tests, with or without symptom questions in parallel, suggests a high yield of persons with tuberculosis disease. However, additional considerations will determine the design of screening and diagnostic algorithms, such as the availability and accessibility of CXR facilities or the resources to fund them, and the need for more or fewer diagnostic tests to confirm the diagnosis (depending on screening test specificity), which also has resource implications. These review findings should be interpreted with caution due to methodological limitations in the included studies and regional variation in sensitivity and specificity. The sensitivity and specificity of an index test in a specific setting cannot be predicted with great precision due to heterogeneity. This should be borne in mind when planning for and implementing tuberculosis screening programmes.
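The link between screening test accuracy, yield, and the burden of confirmatory testing can be made concrete with a small calculation. The sketch below is illustrative only: the population size and the 0.5% prevalence are hypothetical, the accuracy values are the review's summary estimates for CXR abnormalities suggestive of tuberculosis, and the confirmatory test is assumed to be perfect, which it is not in practice.

```python
def screening_yield(population, prevalence, sensitivity, specificity):
    """Expected screening outcomes, assuming a perfect confirmatory test."""
    diseased = population * prevalence
    not_diseased = population - diseased
    true_pos = diseased * sensitivity
    false_pos = not_diseased * (1 - specificity)
    screen_pos = true_pos + false_pos
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "missed_cases": diseased - true_pos,
        # Everyone who screens positive needs a confirmatory test.
        "confirmatory_tests": screen_pos,
        "ppv": true_pos / screen_pos if screen_pos else float("nan"),
        "tests_per_case_found": screen_pos / true_pos if true_pos else float("inf"),
    }


# Hypothetical programme: 100,000 adults screened, 0.5% prevalence.
cxr_suggestive = screening_yield(
    population=100_000, prevalence=0.005, sensitivity=0.848, specificity=0.956
)
for key, value in cxr_suggestive.items():
    print(f"{key}: {value:.3f}")
```

Because prevalence is low in most screened populations, even a small drop in specificity adds many false positives, which is why screening test specificity drives the number of confirmatory tests needed.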