Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón, Madrid, Spain.
Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain.
Virol J. 2024 May 30;21(1):121. doi: 10.1186/s12985-024-02347-5.
During the pandemic, whole genome sequencing was critical to characterize SARS-CoV-2 for surveillance, clinical and therapeutic purposes. However, low viral loads in specimens often led to suboptimal sequencing, making lineage assignment and phylogenetic analysis difficult. We propose an alternative approach for these specimens: sequencing each one in triplicate and bioinformatically concatenating the resulting reads. The proposal rests on the hypothesis that the uncovered regions differ between replicates, so that concatenation compensates for these gaps and recovers a larger percentage of the genome.
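As an illustration of the concatenation step, the following is a minimal sketch, not the authors' pipeline. It assumes each replicate yields one FASTQ file and that all replicate files share the same compression (all plain text or all gzip). The function name `concatenate_fastq` and the file names in the usage note are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Iterable, Union

PathLike = Union[str, Path]


def concatenate_fastq(replicate_paths: Iterable[PathLike], pooled_path: PathLike) -> None:
    """Append replicate FASTQ files bytewise into a single pooled file.

    Assumption: every input uses the same compression. Gzip files may be
    joined this way because a gzip stream can hold several members in a row.
    Plain-text inputs must each end with a newline, otherwise the last
    record of one file would run into the first record of the next.
    """
    with open(pooled_path, "wb") as pooled:
        for path in replicate_paths:
            with open(path, "rb") as replicate:
                # Stream in chunks so large files are never loaded whole.
                shutil.copyfileobj(replicate, pooled)
```

A call such as `concatenate_fastq(["rep1.fastq.gz", "rep2.fastq.gz", "rep3.fastq.gz"], "pooled.fastq.gz")` would produce one file that can be mapped and analysed like the output of a single sequencing reaction.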
Whole genome sequencing was performed in triplicate on 30 samples with Ct > 32, and the benefit of concatenating replicate reads was assessed. After concatenation: i) 28% of samples reached the standard coverage quality threshold (> 90% of the genome covered at > 30x); ii) 39% of samples did not reach the coverage quality threshold, but their coverage improved by more than 40%; and iii) SARS-CoV-2 lineage assignment became possible in 68.7% of the samples in which it had been impaired.
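The coverage criterion can be made concrete with a short sketch. It assumes per-position read depths are already available for each replicate (for example, exported from a depth-reporting tool) as equal-length lists over the same reference, and that pooling reads amounts to summing depths position by position. The function names and the toy 10-position genome are illustrative and do not come from the study.

```python
from typing import List, Sequence


def pooled_depth(replicates: Sequence[Sequence[int]]) -> List[int]:
    """Sum per-position depth across replicates mapped to the same reference."""
    if len({len(r) for r in replicates}) != 1:
        raise ValueError("all replicates must cover the same reference length")
    return [sum(position) for position in zip(*replicates)]


def breadth_above(depth: Sequence[int], min_depth: int = 30) -> float:
    """Fraction of positions whose depth is strictly greater than min_depth."""
    if not depth:
        return 0.0
    return sum(d > min_depth for d in depth) / len(depth)


def passes_threshold(depth: Sequence[int], min_depth: int = 30,
                     min_breadth: float = 0.90) -> bool:
    """Apply the > 90% of the genome at > 30x criterion quoted above."""
    return breadth_above(depth, min_depth) > min_breadth


# Toy example: each replicate covers a different part of the genome.
rep_a = [35, 35, 35, 35, 0, 0, 0, 0, 0, 0]
rep_b = [0, 0, 0, 0, 35, 35, 35, 0, 0, 0]
rep_c = [0, 0, 0, 0, 0, 0, 0, 35, 35, 35]
pooled = pooled_depth([rep_a, rep_b, rep_c])
```

In this toy case no single replicate passes the threshold, while the pooled depth does, which is the situation the hypothesis describes: gaps that fall in different places in each replicate are filled once the reads are combined.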
Concatenation of reads from replicate sequencing reactions provides a simple way to access hidden information in the large proportion of SARS-CoV-2-positive specimens excluded from analysis in standard sequencing schemes. This approach will enhance our potential to rule out involvement in outbreaks, to characterize reinfections and to identify lineages of concern for surveillance or therapeutic purposes.