Walter Katharine S, Kim Eugene, Verma Renu, Altamirano Jonathan, Leary Sean, Carrington Yuan J, Jagannathan Prasanna, Singh Upinder, Holubar Marisa, Subramanian Aruna, Khosla Chaitan, Maldonado Yvonne, Andrews Jason R
Division of Epidemiology, University of Utah, Salt Lake City, Utah, USA.
Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, California, USA.
Open Forum Infect Dis. 2023 Jan 7;10(2):ofad001. doi: 10.1093/ofid/ofad001. eCollection 2023 Feb.
The limited variation observed among severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) consensus sequences makes it difficult to reconstruct transmission linkages in outbreak settings. Previous studies have recovered variation within individual SARS-CoV-2 infections but have not yet measured the informativeness of within-host variation for transmission inference.
We performed tiled amplicon sequencing on 307 SARS-CoV-2 samples, including 130 samples from 32 individuals in 14 households and 47 longitudinally sampled individuals, from 4 prospective studies with household membership data, a proxy for transmission linkage.
Consensus sequences from households had limited diversity (mean pairwise distance, 3.06 single-nucleotide polymorphisms [SNPs]; range, 0-40). Most samples (83.1%, 255 of 307) harbored at least 1 intrahost single-nucleotide variant (iSNV) above a minor allele frequency threshold of 0.2% (median, 117 iSNVs; interquartile range [IQR], 17-208). Pairs in the same household shared significantly more iSNVs (mean, 1.20 iSNVs; 95% confidence interval [CI], 1.02-1.39) than did pairs in different households infected with the same viral clade (mean, 0.31 iSNVs; 95% CI, 0.28-0.34), a signal that decreases with increasingly stringent minor allele frequency thresholds. The number of shared iSNVs was significantly associated with increased odds of household membership (adjusted odds ratio, 1.35; 95% CI, 1.23-1.49). However, the poor concordance of iSNVs detected across sequencing replicates (24.8% and 35.0% above the 0.2% and 1% thresholds, respectively) confirms technical concerns that current sequencing and bioinformatic workflows do not consistently recover low-frequency within-host variants.
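The shared-iSNV comparison reported above can be illustrated with a minimal sketch. It is not the authors' pipeline. The sample identifiers, household labels, variant sites, and frequencies are hypothetical toy values, and the function names `isnvs_above` and `mean_shared_isnvs` are invented for illustration. The sketch also skips the study's restriction of between-household pairs to the same viral clade.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy data (not from the study):
# sample id -> (household, {(genome position, alternate base): minor allele frequency})
samples = {
    "A1": ("hh1", {(241, "T"): 0.004, (3037, "T"): 0.020, (14408, "T"): 0.300}),
    "A2": ("hh1", {(241, "T"): 0.003, (3037, "T"): 0.015, (23403, "G"): 0.050}),
    "B1": ("hh2", {(3037, "T"): 0.012, (11083, "T"): 0.006}),
    "B2": ("hh2", {(11083, "T"): 0.008, (28881, "A"): 0.100}),
}


def isnvs_above(variants, maf_threshold):
    """Return the set of variant sites at or above the minor allele frequency threshold."""
    return {site for site, freq in variants.items() if freq >= maf_threshold}


def mean_shared_isnvs(samples, maf_threshold):
    """Mean number of shared iSNVs for within-household and between-household pairs."""
    within, between = [], []
    for (_, (hh_a, var_a)), (_, (hh_b, var_b)) in combinations(samples.items(), 2):
        shared = len(isnvs_above(var_a, maf_threshold) & isnvs_above(var_b, maf_threshold))
        (within if hh_a == hh_b else between).append(shared)
    return mean(within), mean(between)


for threshold in (0.002, 0.01):
    within_mean, between_mean = mean_shared_isnvs(samples, threshold)
    print(f"MAF >= {threshold}: within={within_mean}, between={between_mean}")
```

In this toy data set, within-household pairs share more iSNVs than between-household pairs at the 0.2% threshold, and the difference disappears at the 1% threshold. This mirrors the direction of the reported effect but says nothing about its size.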
Shared within-host variation may augment the information in consensus sequences for predicting transmission linkages. Improving sensitivity and specificity of within-host variant identification will improve the informativeness of within-host variation.