Suppr超能文献

惠勒图:基于Burrows-Wheeler变换的数据结构框架。

Wheeler graphs: A framework for BWT-based data structures.

作者信息

Gagie Travis, Manzini Giovanni, Sirén Jouni

机构信息

Diego Portales University and CEBIB, Santiago, Chile.

University of Eastern Piedmont, Alessandria, Italy.

出版信息

Theor Comput Sci. 2017 Oct 25;698:67-78. doi: 10.1016/j.tcs.2017.06.016.

Abstract

The famous Burrows-Wheeler Transform (BWT) was originally defined for a single string but variations have been developed for sets of strings, labeled trees, de Bruijn graphs, etc. In this paper we propose a framework that includes many of these variations and that we hope will simplify the search for more. We first define and show they have a property we call . We show that if the state diagram of a finite-state automaton is a Wheeler graph then, by its path coherence, we can order the nodes such that, for any string, the nodes reachable from the initial state or states by processing that string are consecutive. This means that even if the automaton is non-deterministic, we can still store it compactly and process strings with it quickly. We then rederive several variations of the BWT by designing straightforward finite-state automata for the relevant problems and showing that their state diagrams are Wheeler graphs.

摘要

著名的Burrows-Wheeler变换(BWT)最初是为单个字符串定义的,但已经针对字符串集、标记树、德布鲁因图等开发了变体。在本文中,我们提出了一个框架,其中包括许多这些变体,并且我们希望这将简化对更多变体的搜索。我们首先定义并展示它们具有我们称为 的属性。我们表明,如果有限状态自动机的状态图是惠勒图,那么通过其路径连贯性,我们可以对节点进行排序,使得对于任何字符串,通过处理该字符串从初始状态或多个初始状态可达的节点是连续的。这意味着即使自动机是非确定性的,我们仍然可以紧凑地存储它并快速用它处理字符串。然后,我们通过为相关问题设计直接的有限状态自动机并表明它们的状态图是惠勒图,重新推导了BWT的几个变体。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95f9/5727778/d067478b9a6c/gr001.jpg

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验