Davenport Clare, Arevalo-Rodriguez Ingrid, Mateos-Haro Miriam, Berhane Sarah, Dinnes Jacqueline, Spijker René, Buitrago-Garcia Diana, Ciapponi Agustín, Takwoingi Yemisi, Deeks Jonathan J, Emperador Devy, Leeflang Mariska M G, Van den Bruel Ann
Department of Applied Health Science, School of Health Sciences, University of Birmingham, Birmingham, UK.
NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK.
Cochrane Database Syst Rev. 2024 Dec 16;12(12):CD014780. doi: 10.1002/14651858.CD014780.
BACKGROUND: Sample collection is a key driver of accuracy in the diagnosis of SARS-CoV-2 infection. Viral load may vary at different anatomical sampling sites and accuracy may be compromised by difficulties obtaining specimens and the expertise of the person taking the sample. It is important to optimise sampling accuracy within cost, safety and accessibility constraints.
OBJECTIVES: To compare the sensitivity of different sample collection sites and methods for the detection of current SARS-CoV-2 infection with any molecular or antigen-based test.
SEARCH METHODS: We searched the Cochrane COVID-19 Study Register and the COVID-19 Living Evidence Database from the University of Bern (which includes daily updates from PubMed and Embase and preprints from medRxiv and bioRxiv) on 22 February 2022. We included independent evaluations from national reference laboratories, FIND and the Diagnostics Global Health website. We did not apply language restrictions.
SELECTION CRITERIA: We included studies of symptomatic or asymptomatic people with suspected SARS-CoV-2 infection undergoing testing. We included studies of any design that compared results from different sample types (anatomical location, operator, collection device) collected from the same participant within a 24-hour period.
DATA COLLECTION AND ANALYSIS: Within a sample pair, we defined a reference sample and an index sample collected from the same participant within the same clinical encounter (within 24 hours). Where the comparison was between different anatomical sites, the reference sample was defined as a nasopharyngeal or combined naso/oropharyngeal sample collected into the same sample container, and the index sample as the sample from the alternative anatomical site. Where the comparison concerned differences in the collection method at the same site, we defined the reference sample as that closest to standard practice for that sample type. Where the comparison concerned differences in the personnel collecting the sample, the sample collected by the more skilled or experienced operator was considered the reference sample. Two review authors independently assessed the risk of bias and applicability concerns using the QUADAS-2 and QUADAS-C checklists, tailored to this review. We present estimates of the difference in sensitivity (reference sample sensitivity (%) minus index sample sensitivity (%)) within a pair, and as an average across studies for each index sampling method, using forest plots and tables. We examined heterogeneity between studies according to population characteristics (age, symptom status) and index sample characteristics (time post-symptom onset, operator expertise, use of transport medium).
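The paired comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the review's analysis code: the `PairedCounts` structure, the function names and the example counts are all hypothetical. The sign convention used here (index minus reference, so that a negative value means a less sensitive index sample) is an assumption chosen to match the signs of the reported results.

```python
from dataclasses import dataclass


@dataclass
class PairedCounts:
    """Counts among infected participants in one evaluation (hypothetical structure)."""
    infected: int            # participants classified as having SARS-CoV-2
    reference_positive: int  # of those, positive on the reference sample
    index_positive: int      # of those, positive on the index sample


def sensitivity(positives: int, infected: int) -> float:
    """Sensitivity as a percentage of infected participants detected."""
    if infected <= 0:
        raise ValueError("at least one infected participant is required")
    return 100.0 * positives / infected


def sensitivity_difference(counts: PairedCounts) -> float:
    """Index minus reference sensitivity, in percentage points (assumed sign convention)."""
    index_sens = sensitivity(counts.index_positive, counts.infected)
    reference_sens = sensitivity(counts.reference_positive, counts.infected)
    return index_sens - reference_sens


# Made-up counts for illustration only; they are not taken from any included study.
example = PairedCounts(infected=200, reference_positive=180, index_positive=160)
difference = sensitivity_difference(example)
print(f"Sensitivity difference: {difference:+.1f} percentage points")
```

Because both samples come from the same participants, the two sensitivities share a denominator; a full analysis would also need to account for that pairing when estimating uncertainty, which this sketch does not attempt.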
MAIN RESULTS: This review includes 106 studies reporting 154 evaluations and 60,523 sample pair comparisons, of which 11,045 had SARS-CoV-2 infection. Ninety evaluations were of saliva samples, 37 nasal, seven oropharyngeal, six gargle, six oral and four combined nasal/oropharyngeal samples. Four evaluations were of the effect of operator expertise on the accuracy of three different sample types. The majority of included evaluations (146) used molecular tests, of which 140 used RT-PCR (reverse transcription polymerase chain reaction). Eight evaluations were of nasal samples used with Ag-RDTs (rapid antigen tests). The majority of studies were conducted in Europe (35/106, 33%) or the USA (27%), and in dedicated COVID-19 testing clinics or ambulatory hospital settings (53%). Targeted screening or contact tracing accounted for only 4% of evaluations. Where reported, most evaluations were in adults (91/154, 59%); 28 (18%) were in mixed populations and only seven (4%) were in children. The median prevalence of confirmed SARS-CoV-2 was 23% (interquartile range (IQR) 13% to 40%). Risk of bias and applicability assessments were hampered by poor reporting in 77% and 65% of included studies, respectively. Only 3% of evaluations were at low risk of bias across all domains; the concerns were inappropriate inclusion or exclusion criteria, unclear recruitment, lack of blinding, non-randomised sampling order and differences in testing kit within a sample pair. Sixty-eight percent of evaluation cohorts were judged to be at high or unclear applicability concern, either because the prevalence of SARS-CoV-2 infection in the study population was inflated by selectively including individuals with confirmed PCR-positive samples, or because there was insufficient detail to allow replication of sample collection.
When used with RT-PCR
• There was no evidence of a difference in sensitivity between gargle and nasopharyngeal samples (on average -1 percentage points, 95% CI -5 to +2, based on 6 evaluations, 2138 sample pairs, of which 389 had SARS-CoV-2).
• There was no evidence of a difference in sensitivity between saliva collected from the deep throat and nasopharyngeal samples (on average +10 percentage points, 95% CI -1 to +21, based on 2192 sample pairs, of which 730 had SARS-CoV-2).
• There was evidence that saliva collected by spitting, drooling or salivating was, on average, 12 percentage points less sensitive than nasopharyngeal samples (difference -12 percentage points, 95% CI -16 to -8, based on 27,253 sample pairs, of which 4636 had SARS-CoV-2). We did not find any evidence of a difference in sensitivity between saliva collected by spitting, drooling or salivating (sensitivity difference: range from -13 percentage points (spit) to -21 percentage points (salivate)).
• Nasal samples (anterior and mid-turbinate collection combined) were, on average, 12 percentage points less sensitive than nasopharyngeal samples (95% CI -17 to -7), based on 9291 sample pairs, of which 1485 had SARS-CoV-2. We did not find any evidence of a difference in sensitivity between nasal samples collected from the mid-turbinates (3942 sample pairs) and from the anterior nares (8272 sample pairs).
• There was evidence that oropharyngeal samples were, on average, 17 percentage points less sensitive than nasopharyngeal samples (95% CI -29 to -5), based on seven evaluations, 2522 sample pairs, of which 511 had SARS-CoV-2.
A much smaller volume of evidence was available for combined nasal/oropharyngeal samples and oral samples. Age, symptom status and use of transport media do not appear to affect the sensitivity of saliva samples and nasal samples.
When used with Ag-RDTs
• There was no evidence of a difference in sensitivity between nasal samples and nasopharyngeal samples (sensitivity difference, on average, 0 percentage points, -0.2 to +0.2, based on 3688 sample pairs, of which 535 had SARS-CoV-2).
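As a reading aid, the RT-PCR comparisons reported above can be checked against a simple rule: an average difference is described as showing "evidence of a difference" when its 95% CI excludes zero. The sketch below applies that rule to the point estimates and intervals quoted in this abstract. The rule is a simplifying assumption for illustration (the review's own judgements may rest on more than the interval), and the dictionary and function names are hypothetical.

```python
# (average difference, CI lower, CI upper) in percentage points versus
# nasopharyngeal samples, copied from the RT-PCR results in this abstract.
REPORTED_RT_PCR = {
    "gargle": (-1, -5, 2),
    "deep-throat saliva": (10, -1, 21),
    "saliva (spit, drool or salivate)": (-12, -16, -8),
    "nasal": (-12, -17, -7),
    "oropharyngeal": (-17, -29, -5),
}


def ci_excludes_zero(lower: float, upper: float) -> bool:
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


# Flag each sample type whose interval excludes zero (assumed decision rule).
evidence_of_difference = {
    sample_type: ci_excludes_zero(lower, upper)
    for sample_type, (_, lower, upper) in REPORTED_RT_PCR.items()
}

for sample_type, flagged in evidence_of_difference.items():
    verdict = "evidence of a difference" if flagged else "no evidence of a difference"
    print(f"{sample_type}: {verdict}")
```

Under this rule, gargle and deep-throat saliva are the two sample types whose intervals include zero, which agrees with how the results are worded.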
AUTHORS' CONCLUSIONS: When used with RT-PCR, there is no evidence of a difference in sensitivity between self-collected gargle or deep-throat saliva samples and nasopharyngeal samples collected by healthcare workers. Use of these alternative, self-collected sample types has the potential to reduce cost and discomfort, and to improve the safety of sampling by reducing the risk of transmission from aerosol spread, which occurs as a result of coughing and gagging during nasopharyngeal or oropharyngeal sample collection. This may, in turn, improve access to and uptake of testing. Other types of saliva, nasal, oral and oropharyngeal samples are, on average, less sensitive than healthcare worker-collected nasopharyngeal samples, and it is unlikely that sensitivities of this magnitude would be acceptable for confirmation of SARS-CoV-2 infection with RT-PCR. When used with Ag-RDTs, there is no evidence of a difference in sensitivity between nasal samples and healthcare worker-collected nasopharyngeal samples for detecting SARS-CoV-2. The implications of this for self-testing are unclear, as evaluations did not report whether nasal samples were self-collected or collected by healthcare workers. Further research is needed in asymptomatic individuals, in children and on Ag-RDTs, and to investigate the effect of operator expertise on accuracy. Quality assessment of the evidence base underpinning these conclusions was restricted by poor reporting. There is a need for further high-quality studies that adhere to reporting standards for test accuracy studies.