Kania Adrian, Sarapata Krzysztof
Department of Computational Biophysics and Bioinformatics, Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University, Gronostajowa 7, 30-387 Cracow, Poland.
Genomics. 2021 May;113(3):1428-1437. doi: 10.1016/j.ygeno.2021.03.015. Epub 2021 Mar 11.
Numerical representation of biological sequences plays an important role in bioinformatics and has many practical applications. One of the most popular approaches is the chaos game representation. In this paper, the authors propose a new view of chaos game construction: an analytical description of the procedure. This description makes it possible to build more general number sequences using different weight functions. The authors suggest three conditions that these functions should satisfy. Additionally, they present criteria for comparing the functions and for checking whether they provide a unique representation. One of the most important advantages of the approach is that it allows the construction of a description that is less sensitive to mutations and, as a result, gives more reliable values for alignment-free phylogenetic tree construction. Finally, the authors applied the DFT method using four types of functions and compared the obtained results using the BLAST tool.
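As a rough illustration of what an analytical description of the chaos game can look like, the sketch below computes the standard chaos game points of a DNA sequence in two ways: by the usual midpoint iteration, and by the equivalent closed-form weighted sum in which the i-th nucleotide of a length-n prefix contributes with weight 2^-(n-i+1). The corner assignment, the starting point (0.5, 0.5), and all function names are assumptions made for this example; the abstract does not give the authors' notation or their weight functions.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Conventional corner assignment for the unit square.
# This is an assumption: the abstract does not specify one.
CORNERS: Dict[str, Point] = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_iterative(seq: str, start: Point = (0.5, 0.5)) -> List[Point]:
    """Standard chaos game: each point is the midpoint between the
    previous point and the corner of the current nucleotide."""
    x, y = start
    points: List[Point] = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = 0.5 * (x + cx), 0.5 * (y + cy)
        points.append((x, y))
    return points


def cgr_closed_form(seq: str, n: int, start: Point = (0.5, 0.5)) -> Point:
    """n-th chaos game point written as an explicit weighted sum:
    2^-n * start + sum over i = 1..n of 2^-(n-i+1) * corner(seq[i-1])."""
    w0 = 0.5 ** n
    x, y = w0 * start[0], w0 * start[1]
    for i in range(1, n + 1):
        w = 0.5 ** (n - i + 1)  # geometric weight of the i-th nucleotide
        cx, cy = CORNERS[seq[i - 1]]
        x += w * cx
        y += w * cy
    return (x, y)


if __name__ == "__main__":
    sequence = "ACGTTGCA"
    for n, point in enumerate(cgr_iterative(sequence), start=1):
        print(n, point, cgr_closed_form(sequence, n))
```

In this form the geometric weights 2^-k are the place where a different weight function could be substituted to obtain a more general number sequence; which functions are admissible, and the three conditions they should satisfy, are stated only in the paper itself.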