Tewolde Rediat, Dallman Timothy, Schaefer Ulf, Sheppard Carmen L, Ashton Philip, Pichon Bruno, Ellington Matthew, Swift Craig, Green Jonathan, Underwood Anthony
Infectious Disease Informatics Unit, Public Health England, London, United Kingdom.
Gastrointestinal Bacteria Reference Unit, Public Health England, London, United Kingdom.
PeerJ. 2016 Aug 17;4:e2308. doi: 10.7717/peerj.2308. eCollection 2016.
Multilocus sequence typing (MLST) is an effective method to describe bacterial populations. Conventionally, MLST involves Polymerase Chain Reaction (PCR) amplification of housekeeping genes followed by Sanger DNA sequencing. Public Health England (PHE) is in the process of replacing the conventional MLST methodology with a method based on short read sequence data derived from Whole Genome Sequencing (WGS). This paper compares the reliability of MLST results derived from WGS data, using mapping- and assembly-based approaches, with that of the conventional method, using 323 bacterial genomes of diverse species. The sensitivity of the two WGS-based methods was further investigated with 26 mixed and 29 low coverage genomic data sets from Salmonella enteritidis and Streptococcus pneumoniae. Of the 323 samples, 92.9% (n = 300), 97.5% (n = 315) and 99.7% (n = 322) yielded full MLST profiles by the conventional method, the assembly-based approach and the mapping-based approach, respectively. For the samples typed by both the conventional method (92.9%) and the two WGS methods, concordance was 100%. From the 55 mixed and low coverage genomes, 89.1% (n = 49) and 67.3% (n = 37) full MLST profiles were derived from the mapping- and assembly-based approaches, respectively. In conclusion, deriving MLST from WGS data is more sensitive than the conventional method. Of the two WGS-based methods, the mapping-based approach was the more sensitive. In addition, the mapping-based approach described here derives quality metrics, which are difficult to determine quantitatively using the conventional and WGS assembly-based approaches.
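The percentages reported in the abstract can be re-derived from the stated counts. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative and are not taken from the paper or its software.

```python
# Counts of full MLST profiles as stated in the abstract.
MAIN_TOTAL = 323        # genomes in the main comparison
CHALLENGE_TOTAL = 55    # 26 mixed + 29 low coverage data sets

main_counts = {"conventional": 300, "assembly": 315, "mapping": 322}
challenge_counts = {"mapping": 49, "assembly": 37}


def percent(n: int, total: int) -> float:
    """Return n as a percentage of total, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Percentage of samples with a full MLST profile, per method.
main_pct = {method: percent(n, MAIN_TOTAL) for method, n in main_counts.items()}
challenge_pct = {
    method: percent(n, CHALLENGE_TOTAL) for method, n in challenge_counts.items()
}

if __name__ == "__main__":
    for method, pct in main_pct.items():
        print(f"main set, {method}: {pct}%")
    for method, pct in challenge_pct.items():
        print(f"mixed/low coverage set, {method}: {pct}%")
```

Rounding to one decimal place reproduces the figures quoted above (92.9%, 97.5%, 99.7% for the 323 genomes; 89.1% and 67.3% for the 55 mixed and low coverage data sets).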