Department of Biotechnology, Oswaldo Cruz Foundation, Eusébio 61773-270, Brazil.
Department of Medicine, Federal University of Ceará, Fortaleza 60430-160, Brazil.
Anal Chem. 2024 Nov 19;96(46):18537-18544. doi: 10.1021/acs.analchem.4c04492. Epub 2024 Nov 4.
Emerging and evolving Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) lineages, adapted to changing epidemiological conditions, present unprecedented challenges to global public health systems. Here, we introduce an adapted analytical approach that complements genomic sequencing with a cost-effective quantitative polymerase chain reaction (qPCR)-based assay. Viral RNA samples from SARS-CoV-2-positive cases detected by diagnostic laboratories or public health network units in Ceará, Brazil, were tracked for genomic surveillance and analyzed by paired-end sequencing combined with integrative genomic analysis. A key structural variation, a specific () gene deletion within the tracked "BE.9" lineages, was validated by gel electrophoresis. The analytical innovation of our method is the optimization of a simple intercalating dye-based qPCR assay, in which primers from the ARTIC v4.1 amplicon panel are repositioned to detect large molecular patterns. This assay distinguishes "BE.9" from "non-BE.9" lineages, particularly BQ.1, without the need for expensive probes or sequencing. The protocol was validated against lineage predictions from next-generation sequencing (NGS) using 525 paired samples, achieving 93.3% sensitivity, 95.1% specificity, and 92.4% agreement, as measured by Cohen's Kappa coefficient. Machine learning (ML) models were trained on the melting curves from intercalating dye-based qPCR of 1724 samples, enabling highly accurate lineage assignment. Among them, the support vector machine (SVM) model performed best and, after fine-tuning, reached ∼96.52% (333/345) accuracy on the test data set. Our integrated approach provides an adapted analytical method that is both cost-effective and scalable, suitable for rapid assessment of emerging variants, especially in resource-limited settings.
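For readers unfamiliar with the validation metrics quoted above, the following minimal sketch shows how sensitivity, specificity, and Cohen's kappa are conventionally derived from a 2x2 table of assay calls against a reference method. The abstract does not report the underlying table, so the counts here are hypothetical and chosen only to make the arithmetic easy to follow; the class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of assay calls against a reference method (e.g. NGS lineage calls)."""
    tp: int  # assay positive, reference positive
    fn: int  # assay negative, reference positive
    fp: int  # assay positive, reference negative
    tn: int  # assay negative, reference negative

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.fp + self.tn


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of reference-positive samples the assay also calls positive."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of reference-negative samples the assay also calls negative."""
    return c.tn / (c.tn + c.fp)


def cohen_kappa(c: ConfusionCounts) -> float:
    """Agreement between the two methods, corrected for chance agreement."""
    n = c.total
    observed = (c.tp + c.tn) / n
    # Chance agreement: product of the marginal totals for each class.
    expected = (
        (c.tp + c.fn) * (c.tp + c.fp) + (c.tn + c.fp) * (c.tn + c.fn)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only; these are NOT the study's data.
example = ConfusionCounts(tp=90, fn=10, fp=5, tn=95)

print(f"sensitivity = {sensitivity(example):.3f}")
print(f"specificity = {specificity(example):.3f}")
print(f"kappa       = {cohen_kappa(example):.3f}")
```

Kappa is lower than raw percent agreement whenever some agreement would be expected by chance alone, which is why it is often reported alongside sensitivity and specificity.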
In this work, the protocol is applied to improve the monitoring of SARS-CoV-2 sublineages, but it can be extended to track any key molecular signature, including the large insertions and deletions (indels) commonly observed in pathogen subtypes. By complementing traditional sequencing methods and using easily trainable machine learning algorithms, our methodology contributes to enhanced molecular surveillance strategies and supports global efforts in pandemic control.
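As a rough illustration of how a melting curve can be turned into numeric features for a classifier, the sketch below computes the negative first derivative of fluorescence with respect to temperature and locates its peak, which is the conventional estimate of the melting temperature (Tm). The abstract does not describe the authors' preprocessing or feature set, so this transform is an assumption. The curves are synthetic sigmoids, not study data, and all function names are hypothetical.

```python
import math


def negative_derivative(temps, fluorescence):
    """Return interior temperatures and -dF/dT estimated by central differences."""
    mid_temps, values = [], []
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        mid_temps.append(temps[i])
        values.append(-slope)
    return mid_temps, values


def melting_peak(temps, fluorescence):
    """Return (Tm estimate, peak height) from the maximum of -dF/dT."""
    mid_temps, values = negative_derivative(temps, fluorescence)
    best = max(range(len(values)), key=values.__getitem__)
    return mid_temps[best], values[best]


def synthetic_melt_curve(temps, tm, width=1.0):
    """Idealized sigmoid melt curve: fluorescence falls as the amplicon denatures."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


# Temperature ramp from 70.0 to 90.0 degrees C in 0.5 degree steps (assumed values).
temps = [70.0 + 0.5 * i for i in range(41)]

# Two synthetic amplicons with different melting temperatures.
curve_a = synthetic_melt_curve(temps, tm=80.0)
curve_b = synthetic_melt_curve(temps, tm=83.0)

tm_a, height_a = melting_peak(temps, curve_a)
tm_b, height_b = melting_peak(temps, curve_b)
print(f"curve A: Tm ~ {tm_a:.1f} C")
print(f"curve B: Tm ~ {tm_b:.1f} C")
```

Features such as the Tm estimate and peak height, or the full derivative profile, could then be passed to a classifier such as an SVM. Which representation works best for a given assay is an empirical question that this sketch does not address.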