Global optimal eBURST analysis of multilocus typing data using a graphic matroid approach.

Authors

Francisco Alexandre P, Bugalho Miguel, Ramirez Mário, Carriço João A

Affiliation

Instituto de Engenharia de Sistemas e Computadores - ID em Lisboa, Lisboa, Portugal.

Publication

BMC Bioinformatics. 2009 May 18;10:152. doi: 10.1186/1471-2105-10-152.

Abstract

BACKGROUND

Multilocus Sequence Typing (MLST) is a frequently used typing method for the analysis of the clonal relationships among strains of several clinically relevant microbial species. MLST is based on the sequences of housekeeping genes, which give each strain a distinct numerical allelic profile that is abbreviated to a unique identifier: the sequence type (ST). The relatedness between two strains can then be inferred from the differences between their allelic profiles. For a more comprehensive analysis of the possible patterns of evolutionary descent, a set of rules was proposed and implemented in the eBURST algorithm. These rules allow the division of a data set into several clusters of related strains, dubbed clonal complexes, by implementing a simple model of clonal expansion and diversification. Within each clonal complex, the rules identify which links between STs correspond to the most probable pattern of descent. However, the eBURST algorithm is not globally optimized, which can result in links within the clonal complexes that violate the proposed rules.
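As an illustration of the profile comparison described above, the following minimal Python sketch counts the loci at which two allelic profiles differ and groups STs into clusters by chaining close profiles together. The three-profile chain, the 7-locus layout, the `max_diff=1` threshold (linking profiles that differ at a single locus) and the function names are assumptions made for this example; they are not taken from the eBURST implementation.

```python
from itertools import combinations


def allelic_distance(p, q):
    """Number of loci at which two allelic profiles differ."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(p, q))


def clonal_complexes(profiles, max_diff=1):
    """Group STs connected by chains of profiles differing at <= max_diff loci.

    The threshold is an assumption of this sketch, not a value from the source.
    """
    parent = {st: st for st in profiles}

    def find(st):
        # Union-find lookup with path halving.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        if allelic_distance(profiles[a], profiles[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical 7-locus profiles keyed by ST number.
profiles = {
    1: (1, 1, 1, 1, 1, 1, 1),
    2: (1, 1, 1, 1, 1, 1, 2),
    3: (1, 1, 1, 1, 1, 3, 2),
    4: (5, 6, 7, 8, 9, 9, 9),
}

print(allelic_distance(profiles[1], profiles[3]))  # 2
print(clonal_complexes(profiles))                  # [[1, 2, 3], [4]]
```

ST 1 and ST 3 differ at two loci but still fall in the same cluster, because each is one locus away from ST 2.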

RESULTS

Here, we present a globally optimized implementation of the eBURST algorithm - goeBURST. The search for a global optimal solution led to the formalization of the problem as a graphic matroid, for which greedy algorithms that provide an optimal solution exist. Several public data sets of MLST data were tested and differences between the two implementations were found and are discussed for five bacterial species: Enterococcus faecium, Streptococcus pneumoniae, Burkholderia pseudomallei, Campylobacter jejuni and Neisseria spp. A novel feature implemented in goeBURST is the representation of the level of tiebreak rule reached before deciding if a link should be drawn, which can be used to visually evaluate the reliability of the represented hypothetical pattern of descent.
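The greedy idea can be sketched with Kruskal's algorithm, the standard greedy procedure for finding an optimal spanning forest on a graphic matroid: sort all candidate links by a total order, then accept each link unless it would close a cycle. The ordering used below (fewest differing loci first, then links whose endpoints have more single-locus neighbours, then lowest ST number) is a simplified stand-in chosen for this example and should not be read as the full goeBURST rule set. The profiles, function names and `max_diff` threshold are likewise hypothetical.

```python
from itertools import combinations


def distance(p, q):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def greedy_forest(profiles, max_diff=1):
    """Kruskal-style greedy selection of links between STs."""
    sts = sorted(profiles)
    # Count of single-locus neighbours per ST (simplified tiebreak, assumed).
    slv = {
        s: sum(distance(profiles[s], profiles[t]) == 1 for t in sts if t != s)
        for s in sts
    }

    def edge_key(edge):
        d, a, b = edge
        hi, lo = max(slv[a], slv[b]), min(slv[a], slv[b])
        # Smaller distance first, then better-connected endpoints, then ST id.
        return (d, -hi, -lo, min(a, b), max(a, b))

    edges = [
        (distance(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sts, 2)
    ]
    edges = sorted((e for e in edges if e[0] <= max_diff), key=edge_key)

    parent = {s: s for s in sts}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    forest = []
    for d, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # accept the link only if it joins two separate trees
            parent[ra] = rb
            forest.append((a, b, d))
    return forest


# Hypothetical 3-locus profiles: STs 1, 2 and 3 are mutual single-locus
# variants, and ST 4 is a single-locus variant of ST 3 only.
profiles = {
    1: (1, 1, 1),
    2: (2, 1, 1),
    3: (3, 1, 1),
    4: (3, 2, 1),
}

forest = greedy_forest(profiles)
print(forest)  # [(1, 3, 1), (2, 3, 1), (3, 4, 1)]
```

In this example ST 3 has the most single-locus neighbours, so the links through it are considered first and the 1-2 link is rejected because it would close a cycle. Because the candidate links are put into one total order before any link is accepted, the result does not depend on the order in which clusters are processed.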

CONCLUSION

goeBURST is a globally optimized implementation of the eBURST algorithm that identifies alternative patterns of descent for several bacterial species. Furthermore, the algorithm can be applied to any multilocus typing data based on the number of differences between numeric profiles. A software implementation is available at http://goeBURST.phyloviz.net.
