Joseph Jijoy, Sasikumar Roschen
Computational Modelling and Simulation, Regional Research Laboratory (CSIR), Thiruvananthapuram, 695019, India.
BMC Bioinformatics. 2006 May 5;7:243. doi: 10.1186/1471-2105-7-243.
Chaos game representation (CGR) of genome sequences has been used for visual representation of genome sequence patterns as well as for alignment-free comparison of sequences based on oligonucleotide frequencies. However, the potential of this representation for making alignment-based comparisons of whole genome sequences has not been exploited.
We present here a fast algorithm for identifying all local alignments between two long DNA sequences using the sequence information contained in CGR points. The local alignments can be depicted graphically in a dot-matrix plot or in text form, and the significant similarities and differences between the two sequences can be identified. We demonstrate the method through comparison of whole genomes of several microbial species. Given two closely related genomes, we generate information on the mismatches, insertions, deletions and shuffles that differentiate them.
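To illustrate what "sequence information contained in CGR points" can mean, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used corner layout A=(0,0), C=(0,1), G=(1,1), T=(1,0) and a start point at the centre of the unit square. The function names `cgr_points`, `decode_suffix` and `shared_words` are hypothetical. The sketch shows that the last k nucleotides can be read back from a single CGR point, and that matching k-letter words between two sequences give (i, j) coordinates of the kind plotted in a dot-matrix view. Exact fractions are used so that decoding is not affected by floating-point rounding.

```python
from fractions import Fraction

# Assumed corner layout (a common convention, not taken from the abstract).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
CORNER_TO_BASE = {xy: base for base, xy in CORNERS.items()}
HALF = Fraction(1, 2)


def cgr_points(seq):
    """Return one CGR point per nucleotide, starting from the centre (1/2, 1/2)."""
    x = y = HALF
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        # Each point is the midpoint between the previous point and the base's corner.
        x = (x + cx) / 2
        y = (y + cy) / 2
        points.append((x, y))
    return points


def decode_suffix(point, k):
    """Recover the last k nucleotides that led to `point`.

    k must not exceed the number of nucleotides consumed to reach the point.
    """
    x, y = point
    bases = []
    for _ in range(k):
        # The quadrant containing the point identifies the most recent base.
        cx = 1 if x >= HALF else 0
        cy = 1 if y >= HALF else 0
        bases.append(CORNER_TO_BASE[(cx, cy)])
        # Undo one midpoint step to step back to the previous point.
        x = 2 * x - cx
        y = 2 * y - cy
    return "".join(reversed(bases))


def shared_words(seq_a, seq_b, k):
    """Return (i, j) start positions where seq_a and seq_b share a k-letter word.

    Illustrative seed finding only; extending seeds into local alignments
    is not shown here.
    """
    pts_a = cgr_points(seq_a)
    pts_b = cgr_points(seq_b)
    index = {}
    for end in range(k - 1, len(seq_a)):
        word = decode_suffix(pts_a[end], k)
        index.setdefault(word, []).append(end - k + 1)
    hits = []
    for end in range(k - 1, len(seq_b)):
        word = decode_suffix(pts_b[end], k)
        for i in index.get(word, []):
            hits.append((i, end - k + 1))
    return hits


if __name__ == "__main__":
    last_point = cgr_points("GATTACA")[-1]
    print(decode_suffix(last_point, 4))
    print(shared_words("ACGTACGT", "TTACGTT", 4))
```

Runs of hits along a diagonal (i and j increasing together) would correspond to candidate local alignments in a dot-matrix plot. For whole genomes, exact fractions would be too slow, and a fixed-precision encoding of the points would be a more practical choice.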
Adding large-scale sequence alignment to the repertoire of alignment-free sequence analysis applications of chaos game representation positions CGR as a powerful sequence analysis tool.