Pokrzywa Rafal
Department of Computer Science, Silesian University of Technology, ul. Akademicka 16, 44-100 Gliwice, Poland.
Int J Bioinform Res Appl. 2009;5(4):432-46. doi: 10.1504/IJBRA.2009.027517.
Genomic sequences contain repeated structures of various lengths and types, either interspersed or in tandem. Repetitive structures play an important role in molecular biology: they are related to the genetic backgrounds of inherited diseases, and they can serve as markers for DNA mapping and DNA fingerprinting. Because biological databases keep growing in size and number, new tools are needed for finding repeats in genomic sequences. This paper presents a new method for searching for tandem repeats in DNA sequences. It is based on the Burrows-Wheeler Transform (BWT), a reversible transform that underlies fast and effective data compression algorithms.
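To illustrate why the BWT is a natural fit for repeat finding, the following is a minimal Python sketch of the transform itself, computed by sorting all rotations of the input. It is not the method presented in the paper, whose details are not given in this abstract. The function names `bwt` and `inverse_bwt`, the `$` sentinel, and the example sequence are assumptions chosen for illustration, and the quadratic-memory rotation sort is only practical for short inputs.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Return the Burrows-Wheeler Transform of `text`.

    Illustrative only: builds every rotation explicitly, so memory use is
    quadratic in the input length. The sentinel must sort before every
    character of the input ('$' sorts before 'A', 'C', 'G', 'T' in ASCII).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Sort all cyclic rotations lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Recover the original text from its BWT (naive table reconstruction)."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepend the BWT column to the current rows, then re-sort.
        table = sorted(c + row for c, row in zip(transformed, table))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    sequence = "ACACAC"  # hypothetical input: the unit "AC" repeated in tandem
    transformed = bwt(sequence)
    print(transformed)               # CCC$AAA
    print(inverse_bwt(transformed))  # ACACAC
```

For the tandem repeat `ACACAC`, the transform groups identical characters into runs (`CCC$AAA`). This clustering of characters that share the same following context is the property that makes the BWT useful both for compression and for locating repeated substrings.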