Borrayo Ernesto, Mendizabal-Ruiz E Gerardo, Vélez-Pérez Hugo, Romo-Vázquez Rebeca, Mendizabal Adriana P, Morales J Alejandro
Computer Sciences Department, CUCEI - Universidad de Guadalajara, Guadalajara, México.
Molecular Biology Laboratory, Farmacobiology Department, CUCEI - Universidad de Guadalajara, Guadalajara, México.
PLoS One. 2014 Nov 13;9(11):e110954. doi: 10.1371/journal.pone.0110954. eCollection 2014.
Genomic signal processing (GSP) is the use of digital signal processing (DSP) tools to analyze genomic data such as DNA sequences. One application of GSP that has not been fully explored is computing the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on doublet values, which increases the number of possible amplitude values of the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate that GAFD is feasible for computing sequence distances and that the descriptors can be used to characterize DNA fragments.
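As a rough illustration of the idea, the sketch below maps a DNA sequence to a numeric signal using doublets and then compares two such signals. It rests on several assumptions that the abstract does not state: doublets are read as overlapping pairs of adjacent nucleotides, the amplitude table `DOUBLET_VALUE` is an arbitrary 0 to 15 assignment, and the Euclidean distance is a generic stand-in rather than one of the three DSP metrics used in the paper. The names `sequence_to_signal` and `euclidean_distance` are hypothetical.

```python
from itertools import product
import math

BASES = "ACGT"

# Hypothetical amplitude table: each of the 16 possible doublets gets its own
# integer value (AA=0, AC=1, ..., TT=15). The paper's actual values may differ.
DOUBLET_VALUE = {a + b: i for i, (a, b) in enumerate(product(BASES, repeat=2))}


def sequence_to_signal(seq):
    """Map a DNA string to a list of amplitudes, one per overlapping doublet.

    Assumes the sequence contains only A, C, G, T; any other symbol
    raises KeyError.
    """
    seq = seq.upper()
    return [DOUBLET_VALUE[seq[i:i + 2]] for i in range(len(seq) - 1)]


def euclidean_distance(x, y):
    """Euclidean distance over the common prefix of two signals.

    A generic stand-in for a signal distance, not the paper's metric.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


signal_a = sequence_to_signal("ACGT")
signal_b = sequence_to_signal("ACGA")
distance = euclidean_distance(signal_a, signal_b)
```

A single-nucleotide mapping allows only 4 amplitude values, while a doublet table like the one assumed here allows 16, which is the increase in amplitude range that the abstract refers to.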