Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, UK.
Bioinformatics. 2012 Nov 15;28(22):2991-3. doi: 10.1093/bioinformatics/bts544. Epub 2012 Sep 26.
Spoligotyping is a well-established genotyping technique based on the presence of unique DNA sequences in Mycobacterium tuberculosis (Mtb), the causal agent of tuberculosis disease (TB). Although advances in sequencing technologies are leading to whole-genome bacterial characterization, tens of thousands of isolates have already been spoligotyped, giving a global view of Mtb strain diversity. To bridge the gap between the two kinds of data, we have developed SpolPred, a software tool that predicts the spoligotype from raw sequence reads. Our approach is compared with strain types determined experimentally and by de novo assembly in a set of 44 Mtb isolates. In silico and experimental results are identical for almost all isolates (39/44). For the remaining five isolates, SpolPred revealed that the experimental spoligotype was false. SpolPred was also more accurate and faster than the assembly-based strategy. Application of SpolPred to an additional seven isolates with no laboratory data led to types that clustered with identical experimental types in a phylogenetic analysis using single-nucleotide polymorphisms. Our results demonstrate the usefulness of the tool and its role in revealing experimental limitations.
SpolPred is written in C and is available from www.pathogenseq.org/spolpred.
Supplementary data are available at Bioinformatics Online.
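The general idea of predicting a spoligotype from raw reads can be illustrated with a minimal sketch. This is not SpolPred's implementation (SpolPred is written in C, and its matching rules are not described in this abstract); it only assumes that a spacer is called present when enough reads contain it, on either strand, by exact match. The function names, the `min_hits` threshold, and the toy spacer and read sequences are all hypothetical and chosen for illustration; they are not real Mtb spacers.

```python
from typing import Iterable, List

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_spacer_hits(reads: Iterable[str], spacers: List[str]) -> List[int]:
    """Count, for each spacer, the reads containing it on either strand.

    Assumption: exact substring matching, with no tolerance for
    sequencing errors. A real tool would likely allow mismatches.
    """
    targets = [(s.upper(), reverse_complement(s.upper())) for s in spacers]
    counts = [0] * len(spacers)
    for read in reads:
        read = read.upper()
        for i, (forward, reverse) in enumerate(targets):
            if forward in read or reverse in read:
                counts[i] += 1
    return counts


def predict_pattern(reads: Iterable[str], spacers: List[str],
                    min_hits: int = 2) -> str:
    """Return a binary presence/absence string, one digit per spacer.

    Assumption: a spacer is called present when at least `min_hits`
    reads contain it; the threshold value is illustrative only.
    """
    counts = count_spacer_hits(reads, spacers)
    return "".join("1" if c >= min_hits else "0" for c in counts)


# Toy data (hypothetical sequences, not real Mtb spacers).
TOY_SPACERS = ["AAACCCGGGT", "TTGACCATGA", "GGCATTCAGC"]
TOY_READS = [
    "TTAAACCCGGGTAA",  # spacer 1, forward strand
    "CCACCCGGGTTTGG",  # spacer 1, reverse strand
    "GGTTGACCATGACC",  # spacer 2, seen only once
    "AGGCATTCAGCT",    # spacer 3, forward strand
    "TGCTGAATGCCA",    # spacer 3, reverse strand
]

if __name__ == "__main__":
    print(predict_pattern(TOY_READS, TOY_SPACERS))
```

In this toy run the second spacer is supported by a single read and therefore falls below the assumed threshold, which shows why a read-count cutoff matters when separating true spacer presence from sporadic sequencing noise.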