Krasilnikova Lydia A, Tomkins-Tinch Christopher H, Gayton Alton C, Schaffner Stephen F, Dobbins Sabrina T, Gladden-Young Adrianne, Siddle Katherine J, Park Daniel J, Sabeti Pardis C
Howard Hughes Medical Institute, Chevy Chase, MD 20815, United States.
Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA 02142, United States.
Bioinformatics. 2024 Nov 28;40(12). doi: 10.1093/bioinformatics/btae698.
In viral genomic research and surveillance, inter-sample contamination can affect variant detection, analysis of within-host evolution, outbreak reconstruction, and detection of superinfections and recombination events. Sample barcoding methods exist to track inter-sample contamination, but they are not always used, and they can only detect contamination that occurs after the point in the experimental pipeline at which the barcodes are added. The genomic data in a sample, however, carry a record of inter-sample contamination occurring at any stage. Here, we present Polyphonia, a tool that uses intrahost variant frequencies to detect inter-sample contamination directly from deep sequencing data, without the need for additional controls. We apply Polyphonia to 1102 SARS-CoV-2 samples that were sequenced at the Broad Institute and had already been tracked using molecular barcoding, allowing comparison between the two approaches.
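The abstract states only that detection relies on intrahost variant frequencies. As a rough illustration of how such a signal could be read, the Python sketch below assumes one plausible heuristic: if sample B contaminated sample A, the consensus alleles of B should appear as low-frequency (minor) alleles in A at positions where the two consensus genomes differ. This is not taken from the Polyphonia source code. The function names, the data layout (position mapped to allele frequencies), and the 3% minor-allele threshold are all hypothetical choices made for the example.

```python
from statistics import median
from typing import Dict, NamedTuple, Optional

# Hypothetical data layout: genome position -> {allele: frequency} for one sample.
AlleleFreqs = Dict[int, Dict[str, float]]


class Evidence(NamedTuple):
    matched: int                   # distinguishing positions where the donor allele is a minor allele
    distinguishing: int            # positions where the two consensus alleles differ
    median_freq: Optional[float]   # median frequency of the matched minor alleles, if any


def consensus(freqs: AlleleFreqs) -> Dict[int, str]:
    """Return the most frequent allele at each position that has data."""
    return {pos: max(alleles, key=alleles.get)
            for pos, alleles in freqs.items() if alleles}


def contamination_evidence(recipient: AlleleFreqs,
                           donor: AlleleFreqs,
                           min_freq: float = 0.03) -> Evidence:
    """Count how often the donor's consensus allele shows up as a minor
    allele in the recipient, at positions where their consensus differs.

    min_freq is an illustrative threshold (an assumption), not a value
    taken from the tool.
    """
    donor_cons = consensus(donor)
    recip_cons = consensus(recipient)
    matched_freqs = []
    distinguishing = 0
    for pos, donor_allele in donor_cons.items():
        # Skip positions missing in the recipient or identical at consensus level.
        if pos not in recip_cons or recip_cons[pos] == donor_allele:
            continue
        distinguishing += 1
        freq = recipient[pos].get(donor_allele, 0.0)
        if freq >= min_freq:
            matched_freqs.append(freq)
    return Evidence(
        matched=len(matched_freqs),
        distinguishing=distinguishing,
        median_freq=median(matched_freqs) if matched_freqs else None,
    )


if __name__ == "__main__":
    # Toy data: sample_a carries sample_b's consensus alleles at about 5-6%.
    sample_a = {100: {"A": 0.95, "G": 0.05}, 200: {"C": 0.94, "T": 0.06}, 300: {"G": 1.0}}
    sample_b = {100: {"G": 1.0}, 200: {"T": 0.99, "C": 0.01}, 300: {"G": 1.0}}
    print(contamination_evidence(recipient=sample_a, donor=sample_b))
    print(contamination_evidence(recipient=sample_b, donor=sample_a))
```

In this toy case, every distinguishing position in sample_a carries sample_b's consensus allele at a similar low frequency, which is the pattern one might expect if sample_b had leaked into sample_a; the reverse direction shows no such pattern. A real analysis would also need to account for read depth, sequencing error, and shared ancestry, none of which this sketch models.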
Polyphonia is available as a standalone Docker image and is also included as part of viral-ngs, available in Dockstore. Full documentation, source code, and instructions for use are available at https://github.com/broadinstitute/polyphonia.