Complete genome sequences of Streptococcus pyogenes type strain reveal 100%-match between PacBio-solo and Illumina-Oxford Nanopore hybrid assemblies.

Affiliations

Department of Infectious Diseases, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 413 46, Gothenburg, Sweden.

Culture Collection University of Gothenburg (CCUG), Sahlgrenska Academy, University of Gothenburg, 413 46, Gothenburg, Sweden.

Publication information

Sci Rep. 2020 Jul 15;10(1):11656. doi: 10.1038/s41598-020-68249-y.

DOI:10.1038/s41598-020-68249-y

PMID:32669560

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7363880/

Abstract

We present the first complete, closed genome sequences of Streptococcus pyogenes strains NCTC 8198 and CCUG 4207, the type strain of the type species of the genus Streptococcus and an important human pathogen that causes a wide range of infectious diseases. S. pyogenes NCTC 8198 and CCUG 4207 derive from deposits of the same strain at two different culture collections. NCTC 8198 was sequenced on a PacBio platform, and the genome sequence was assembled de novo using HGAP. CCUG 4207 was sequenced with Illumina and Oxford Nanopore technologies, and a de novo hybrid assembly combining both read sets was generated using SPAdes. Both strategies yielded closed genome sequences of 1,914,862 bp, identical in length and in sequence. Combining short-read Illumina and long-read Oxford Nanopore sequence data compensated for the expected error rate of nanopore sequencing, producing a genome sequence indistinguishable from the one determined with PacBio. Sequence analyses revealed five prophage regions, a CRISPR-Cas system and numerous virulence factors, but no relevant antibiotic resistance genes. These two complete genome sequences of the type strain of S. pyogenes will serve as taxonomic and genomic references for infectious disease diagnostics, as well as references for future studies and applications within the genus Streptococcus.
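The claim that two closed assemblies are identical can be checked in a simple way. What follows is a minimal sketch, not the authors' method: it assumes each assembly is available as a single plain nucleotide string, and the function names are hypothetical. Because a closed bacterial chromosome is circular, two assemblers may start the sequence at different positions or report opposite strands, so the comparison has to allow for rotation and reverse complement.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.translate(complement)[::-1]


def same_circular_sequence(a: str, b: str) -> bool:
    """Check whether two circular sequences are identical.

    Two circular sequences count as identical if b is a rotation of a,
    or a rotation of the reverse complement of a. This is an exact
    comparison: a single mismatch or indel makes it return False.
    """
    a, b = a.upper(), b.upper()
    if len(a) != len(b):
        return False
    if not a:
        return True
    # Every rotation of a appears as a substring of a + a.
    doubled = a + a
    return b in doubled or reverse_complement(b) in doubled


if __name__ == "__main__":
    # Toy sequences for illustration only, not real genome data.
    print(same_circular_sequence("ATGCCGTA", "CGTAATGC"))  # rotation
    print(same_circular_sequence("AACGT", "GTTAC"))        # other strand, rotated
    print(same_circular_sequence("AACGT", "AACGA"))        # one mismatch
```

For a genome of about 1.9 Mbp the doubled string fits easily in memory. An exact check like this only answers yes or no, so locating any differences between assemblies would need a whole-genome alignment tool instead.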
