Crykey: Rapid Identification of SARS-CoV-2 Cryptic Mutations in Wastewater.

Authors

Liu Yunxi, Sapoval Nicolae, Gallego-García Pilar, Tomás Laura, Posada David, Treangen Todd J, Stadler Lauren B

Affiliations

Department of Computer Science, Rice University, Houston, TX, 77005, USA.

CINBIO, Universidade de Vigo, 36310 Vigo, Spain.

Publication information

medRxiv. 2023 Nov 12:2023.06.16.23291524. doi: 10.1101/2023.06.16.23291524.

Abstract

We present Crykey, a computational tool for rapidly identifying cryptic mutations of SARS-CoV-2. Specifically, we identify co-occurring single nucleotide mutations on the same sequencing read, called linked-read mutations, that are rare or entirely missing in existing databases and have the potential to represent novel cryptic lineages found in wastewater. While previous approaches exist for identifying cryptic linked-read mutations from specific regions of the SARS-CoV-2 genome, there is a need for computational tools that can efficiently track cryptic mutations across the entire genome and across tens of thousands of samples, and that apply increased scrutiny, given that such mutations may represent either artifacts or hidden SARS-CoV-2 lineages. Crykey fills this gap by identifying rare linked-read mutations that pass stringent computational filters to limit the potential for artifacts. We evaluate the utility of Crykey on >3,000 wastewater and >22,000 clinical samples; our findings are three-fold: i) we identify hundreds of cryptic mutations that cover the entire SARS-CoV-2 genome, ii) we track the presence of these cryptic mutations across multiple wastewater treatment plants and over three years of sampling in Houston, and iii) we find that a handful of cryptic mutations in wastewater mirror cryptic mutations in clinical samples, and we investigate their potential to represent real cryptic lineages. In summary, Crykey enables large-scale detection of cryptic mutations representing potential cryptic lineages in wastewater.
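The idea of a linked-read mutation can be illustrated with a minimal Python sketch. It is not Crykey's implementation: the function names, the `min_reads` threshold, the gapless-alignment assumption, and the `known_combinations` set standing in for an existing database are all hypothetical choices made for illustration.

```python
from collections import Counter

def read_snvs(reference, start, read_seq):
    """Return the SNVs of one read as (position, ref_base, alt_base) tuples.

    Assumes the read is aligned to `reference` at 0-based `start`
    without insertions or deletions (a simplification).
    """
    snvs = []
    for offset, base in enumerate(read_seq):
        ref_base = reference[start + offset]
        if base != ref_base and base != "N":  # skip ambiguous base calls
            snvs.append((start + offset, ref_base, base))
    return tuple(snvs)

def cryptic_candidates(reference, reads, known_combinations, min_reads=2):
    """Count SNV combinations that co-occur on the same read.

    A combination is reported when at least `min_reads` reads carry
    exactly that set of two or more SNVs and the set is absent from
    `known_combinations` (a hypothetical stand-in for a database lookup).
    """
    counts = Counter()
    for start, seq in reads:
        snvs = read_snvs(reference, start, seq)
        if len(snvs) >= 2:  # a single SNV is not a linked-read mutation
            counts[snvs] += 1
    return {
        combo: n
        for combo, n in counts.items()
        if n >= min_reads and frozenset(combo) not in known_combinations
    }

# Toy data: a 10-base reference and six short reads as (start, sequence).
reference = "ACGTACGTAC"
reads = [
    (0, "ACCTAGGT"),  # carries G2C and C5G
    (1, "CCTAGG"),    # carries the same pair, different start
    (2, "GTACGT"),    # matches the reference
    (3, "TATGTA"),    # carries a single SNV only
    (4, "ATGAAC"),    # carries C5T and T7A
    (4, "ATGAAC"),
]
# Pretend the C5T + T7A pair is already catalogued.
known_combinations = {frozenset({(5, "C", "T"), (7, "T", "A")})}

candidates = cryptic_candidates(reference, reads, known_combinations)
```

The sketch treats "cryptic" as simply absent from the known set. Handling combinations that are rare rather than missing would require prevalence counts from a database, and real reads would need indel-aware parsing and quality filtering, none of which is modeled here.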

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae11/10659477/f35eed3439ba/nihpp-2023.06.16.23291524v2-f0001.jpg
