Department of Earth and Environmental Sciences, National Chung Cheng University, Chiayi 621, Taiwan, ROC; Agricultural Biotechnology Research Center, Academia Sinica, Taipei 115, Taiwan, ROC.
Department of Internal Medicine, Cheng Hsin General Hospital, Taipei, Taiwan, ROC.
Sci Total Environ. 2017 Mar 1;581-582:378-385. doi: 10.1016/j.scitotenv.2016.12.144. Epub 2016 Dec 30.
Multilocus sequence typing (MLST) is an approach for predicting Salmonella serovars and eBURST groups (eBGs) based on a typing scheme of seven housekeeping genes. To date, >220,000 allelic profiles and 65,973 Salmonella strains have been recorded in the MLST database. Several studies have modified the MLST method to target fewer housekeeping genes for the sake of economy and efficiency. Nevertheless, no study has systematically evaluated the correlation between the number of housekeeping genes targeted and prediction accuracy. In this study, we aimed to tackle this problem by extracting data from the MLST database as a whole using the software RStudio. Our results indicated that as the number of genes in the MLST scheme increased, the accuracy of eBG prediction increased, reaching 100% when the number of genes was greater than or equal to 5. To examine the applicability of the approach, 395 environmental water samples were analyzed. A set of 52 Salmonella enterica isolates was initially subjected to MLST targeting seven housekeeping genes. A total of 29 sequence types, including 11 new sequence types, were found among the 52 sequenced isolates, which differentiated into 19 serotypes. Moreover, two novel sequence types did not belong to any current classification. Our results show that three-gene sequence typing (aroC, hisD, and purE) was as accurate as seven-gene sequence typing for prediction of environmental Salmonella isolates. Our data suggest that the five-gene and reduced gene-number sequence-typing schemes can serve as alternative modified MLST methods when effectiveness and cost are the concerns.
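As an illustration of the kind of evaluation described above, the following minimal Python sketch scores how well reduced sets of loci recover eBG labels. It is not the authors' RStudio procedure. The toy profiles, the eBG labels, and the accuracy definition (a profile counts as correctly predicted when its eBG is the majority eBG among profiles sharing the same alleles at the chosen loci) are all assumptions made for this example. The locus names are those of the standard seven-gene Salmonella MLST scheme.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Loci of the seven-gene Salmonella MLST scheme.
LOCI = ("aroC", "dnaN", "hemD", "hisD", "purE", "sucA", "thrA")

# Hypothetical toy data: (allele numbers in LOCI order, eBG label).
# These are NOT real database records.
PROFILES = [
    ((1, 1, 1, 1, 1, 1, 1), "eBG_A"),
    ((1, 1, 2, 1, 1, 1, 1), "eBG_A"),
    ((1, 2, 3, 2, 1, 2, 2), "eBG_B"),
    ((2, 2, 3, 2, 2, 2, 2), "eBG_B"),
    ((2, 3, 4, 3, 2, 3, 3), "eBG_C"),
]


def subset_accuracy(profiles, loci_idx):
    """Fraction of profiles whose eBG equals the majority eBG among all
    profiles that share the same alleles at the loci in loci_idx.
    (Assumed metric; the abstract does not define its accuracy measure.)"""
    groups = defaultdict(Counter)
    for alleles, ebg in profiles:
        key = tuple(alleles[i] for i in loci_idx)
        groups[key][ebg] += 1
    # Within each group, only the majority-eBG profiles count as correct.
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(profiles)


def accuracy_by_gene_count(profiles, n_loci=len(LOCI)):
    """Mean accuracy over all locus subsets of each size k = 1..n_loci."""
    result = {}
    for k in range(1, n_loci + 1):
        scores = [
            subset_accuracy(profiles, idx)
            for idx in combinations(range(n_loci), k)
        ]
        result[k] = sum(scores) / len(scores)
    return result


if __name__ == "__main__":
    for k, acc in accuracy_by_gene_count(PROFILES).items():
        print(f"{k} gene(s): mean accuracy {acc:.3f}")
    # Score one specific three-gene subset: aroC, hisD, purE.
    three = tuple(LOCI.index(g) for g in ("aroC", "hisD", "purE"))
    print("aroC+hisD+purE:", subset_accuracy(PROFILES, three))
```

Averaging over every subset of a given size is one possible design choice. A real analysis might instead report the best-performing subset for each size.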