ReadItAndKeep: rapid decontamination of SARS-CoV-2 sequencing reads.

Affiliations

EMBL-EBI, Cambridge CB10 1SD, UK.

Nuffield Department of Medicine, University of Oxford, Oxford OX3 9DU, UK.

Publication information

Bioinformatics. 2022 Jun 13;38(12):3291-3293. doi: 10.1093/bioinformatics/btac311.

DOI:10.1093/bioinformatics/btac311

PMID:35551365

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9191204/

Abstract

SUMMARY

Viral sequence data from clinical samples frequently contain contaminating human reads, which must be removed prior to sharing for legal and ethical reasons. To enable host read removal for SARS-CoV-2 sequencing data on low-specification laptops, we developed ReadItAndKeep, a fast lightweight tool for Illumina and nanopore data that only keeps reads matching the SARS-CoV-2 genome. Peak RAM usage is typically below 10 MB, and runtime less than 1 min. We show that by excluding the polyA tail from the viral reference, ReadItAndKeep prevents bleed-through of human reads, whereas mapping to the human genome lets some reads escape. We believe our test approach (including all possible reads from the human genome, human samples from each of the 26 populations in the 1000 genomes data and a diverse set of SARS-CoV-2 genomes) will also be useful for others.
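The effect of the polyA tail can be illustrated with a toy sketch. It is not ReadItAndKeep's implementation: the function names, the made-up sequences and the use of shared k-mers as a stand-in for read mapping are all assumptions chosen for illustration. It shows only why a run of A bases at the end of a reference can attract A-rich reads from another source, and why trimming that run removes the spurious match.

```python
def trim_poly_a_tail(reference: str) -> str:
    """Return the reference with any trailing run of A bases removed."""
    return reference.rstrip("Aa")


def shares_kmer(read: str, reference: str, k: int = 12) -> bool:
    """Crude stand-in for mapping: True if read and reference share any k-mer."""
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(read[i:i + k] in ref_kmers for i in range(len(read) - k + 1))


# Hypothetical toy data, not real genome sequence.
toy_reference = "ACGTTGCACGGATCCTAGGCTAAC" + "A" * 30  # ends in a polyA tail
a_rich_read = "TTGG" + "A" * 20                         # stands in for an A-rich human read

# With the tail present, the A-rich read "matches" the reference.
matches_untrimmed = shares_kmer(a_rich_read, toy_reference)

# With the tail removed, the same read no longer matches.
matches_trimmed = shares_kmer(a_rich_read, trim_poly_a_tail(toy_reference))
```

In this toy case the only k-mer the read shares with the reference lies inside the tail, so trimming the tail is enough to reject the read.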

AVAILABILITY AND IMPLEMENTATION

ReadItAndKeep is implemented in C++, released under the MIT license, and available from https://github.com/GenomePathogenAnalysisService/read-it-and-keep.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
