U.S. Food and Drug Administration, Office of Commissioner, Commissioner's Fellowship Program, Silver Spring, Maryland, USA.
U.S. Food and Drug Administration, Office of Regulatory Affairs, San Francisco Laboratory, Alameda, California, USA.
Appl Environ Microbiol. 2019 Mar 22;85(7). doi: 10.1128/AEM.00165-19. Print 2019 Apr 1.
Bacteria of the genus Shigella, consisting of 4 species and >50 serotypes, cause shigellosis, a foodborne disease of significant morbidity, mortality, and economic loss worldwide. Classical Shigella identification based on selective media and serology is tedious, time-consuming, expensive, and not always accurate. A molecular diagnostic assay does not distinguish Shigella at the species level or from enteroinvasive Escherichia coli (EIEC). We inspected genomic sequences from 221 Shigella isolates and observed low concordance rates between conventional designation and molecular serotyping: 86.4% and 80.5% at the species and serotype levels, respectively. Serotype determinants for 6 additional serotypes were identified. Examination of differentiation gene markers commonly perceived as characteristic hallmarks in Shigella showed high variability among different serotypes. Using this information, we developed ShigaTyper, an automated workflow that utilizes limited computational resources to accurately and rapidly determine 59 Shigella serotypes using Illumina paired-end whole-genome sequencing (WGS) reads. Shigella serotype determinants and species-specific diagnostic markers were first identified through read alignment to an in-house curated reference sequence database. Relying on sequence hits that passed a threshold level of coverage and accuracy, serotype could be unambiguously predicted within 1 min for an average-size WGS sample of ∼500 MB. Validation with WGS data from 380 isolates showed an accuracy rate of 98.2%. This pipeline is the first step toward building a comprehensive WGS-based analysis pipeline of Shigella spp. in a field laboratory setting, where speed is essential and resources need to be more cost-effectively dedicated.
Shigella causes diarrheal disease with serious public health implications. However, conventional Shigella identification methods are laborious and time-consuming and can be erroneous due to the high similarity between Shigella and enteroinvasive Escherichia coli (EIEC) and cross-reactivity between serotyping antisera. Further, serotype interpretation is complicated for inexperienced users. To develop an easier method with higher accuracy based on whole-genome sequencing (WGS) for Shigella serotyping, we systematically examined genomic information of Shigella isolates from 53 serotypes to define rules for differentiation and serotyping. We created ShigaTyper, an automated pipeline that accurately and rapidly excludes non-Shigella isolates and identifies 59 Shigella serotypes using Illumina paired-end WGS reads. A serotype can be unambiguously predicted at a data processing speed of 538 MB/min with 98.2% accuracy from a regular laptop. Once it is installed, training in bioinformatics analysis and Shigella genetics is not required. This pipeline is particularly useful to general microbiologists in field laboratories.
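The abstract says that serotype calls rely on sequence hits passing a threshold level of coverage and accuracy, and reports a processing speed of 538 MB/min. The Python sketch below illustrates only that general idea and is not ShigaTyper's actual code. The cutoff values, the `Hit` record, the helper names, and the marker names are all hypothetical placeholders, since the abstract does not specify them.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration; the abstract does not state the real values.
MIN_COVERAGE = 0.50
MIN_ACCURACY = 0.80


@dataclass
class Hit:
    marker: str      # placeholder name of a reference sequence
    coverage: float  # fraction of the reference covered by aligned reads
    accuracy: float  # fraction of aligned bases that match the reference


def passing_markers(hits, min_coverage=MIN_COVERAGE, min_accuracy=MIN_ACCURACY):
    """Return the sorted names of markers whose hits meet both cutoffs."""
    return sorted(
        h.marker
        for h in hits
        if h.coverage >= min_coverage and h.accuracy >= min_accuracy
    )


def minutes_to_process(size_mb, rate_mb_per_min=538.0):
    """Estimate run time from sample size, using the rate quoted in the abstract."""
    return size_mb / rate_mb_per_min


# Made-up hits: only marker_A satisfies both cutoffs.
hits = [
    Hit("marker_A", coverage=0.98, accuracy=0.99),
    Hit("marker_B", coverage=0.30, accuracy=0.97),  # fails the coverage cutoff
    Hit("marker_C", coverage=0.85, accuracy=0.60),  # fails the accuracy cutoff
]

print(passing_markers(hits))              # ['marker_A']
print(round(minutes_to_process(500), 2))  # 0.93
```

The second printed value shows that a sample of about 500 MB at 538 MB/min takes roughly 0.93 min, consistent with the "within 1 min" figure quoted above.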