Reyes Josephine F, Francis Andrew R, Tanaka Mark M
School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney 2052, Australia.
BMC Bioinformatics. 2008 Nov 27;9:496. doi: 10.1186/1471-2105-9-496.
Molecular typing methods are commonly used to study genetic relationships among bacterial isolates. Many of these methods have become standardized and produce portable data. A popular approach for analyzing such data is to construct graphs, including phylogenies. Inferences from graph representations of data assist in understanding the patterns of transmission of bacterial pathogens, and basing these graph constructs on biological models of evolution of the molecular marker helps make these inferences. Spoligotyping is a widely used method for genotyping isolates of Mycobacterium tuberculosis that exploits polymorphism in the direct repeat region. Our goal was to examine a range of models describing the evolution of spoligotypes in order to develop a visualization method to represent likely relationships among M. tuberculosis isolates.
We found that inferred mutations of spoligotypes frequently involve the loss of a single or very few adjacent spacers. Using a second-order variant of Akaike's Information Criterion, we selected the Zipf model as the basis for resolving ambiguities in the ancestry of spoligotypes. We developed a method to construct graphs of spoligotypes (which we call spoligoforests). To demonstrate this method, we applied it to a tuberculosis data set from Cuba and compared the method to some existing methods.
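The model-selection step described above can be illustrated with a minimal sketch. It assumes a truncated Zipf distribution over deletion lengths, P(l) proportional to l^(-s) for l = 1..43 (the standard spoligotyping spacer count), and compares it with a uniform model using the second-order criterion AICc = 2k - 2 ln L + 2k(k+1)/(n-k-1). The function names, the grid-search fit, and the deletion-length data are hypothetical and are not taken from the paper.

```python
import math

def zipf_log_likelihood(lengths, s, max_len=43):
    """Log-likelihood of deletion lengths under a truncated Zipf model,
    P(l) proportional to l**(-s) for l = 1..max_len (assumed form)."""
    norm = sum(l ** (-s) for l in range(1, max_len + 1))
    return sum(-s * math.log(l) - math.log(norm) for l in lengths)

def aicc(log_lik, k, n):
    """Second-order AIC for k free parameters and n observations."""
    return 2 * k - 2 * log_lik + (2 * k * (k + 1)) / (n - k - 1)

def fit_zipf(lengths, max_len=43):
    """Crude maximum-likelihood fit of the exponent s by grid search."""
    grid = [i / 100 for i in range(1, 501)]  # s in 0.01 .. 5.00
    return max(grid, key=lambda s: zipf_log_likelihood(lengths, s, max_len))

# Hypothetical deletion lengths (number of adjacent spacers lost per event),
# made up for illustration only.
lengths = [1] * 30 + [2] * 8 + [3] * 4 + [5] * 2 + [10]
n = len(lengths)

s_hat = fit_zipf(lengths)
zipf_aicc = aicc(zipf_log_likelihood(lengths, s_hat), k=1, n=n)

# Uniform model: every length 1..43 equally likely, no free parameters.
uniform_aicc = aicc(-n * math.log(43), k=0, n=n)

print(f"s_hat={s_hat:.2f}  AICc(Zipf)={zipf_aicc:.1f}  AICc(uniform)={uniform_aicc:.1f}")
```

The model with the lower AICc is preferred. On data dominated by short deletions, as in this made-up sample, the Zipf model scores lower than the uniform one.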
We propose a new approach to analyzing relationships among M. tuberculosis isolates using spoligotypes. The spoligoforest recovers a plausible history of transmission and mutation events based on the selected deletion model. The method may be suitable for studying markers based on loci of similar structure in other bacteria. The groupings and relationships in the spoligoforest can be analyzed along with the clinical features of strains to provide an understanding of the evolution of spoligotypes.