Annu Int Conf IEEE Eng Med Biol Soc. 2017 Jul;2017:3894-3897. doi: 10.1109/EMBC.2017.8037707.
Phylogenetic information can be extracted from DNA sequences with either alignment-based or alignment-free methods. Alignment-based methods have high computational complexity, while conventional alignment-free methods have low accuracy. In this paper, we propose a new alignment-free method based on the distribution of a repeated k-word measure. This measure is built on k-words and their multiple repeated words. The proposed scheme achieves higher performance than conventional word-count methods while keeping the same overall time complexity. Measured by Robinson-Foulds (RF) distance, it also performs better than conventional alignment-free methods.
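For orientation, the conventional word-count baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the proposed repeated k-word measure, whose definition the abstract does not give. The function names, the default `k=3`, and the choice of Euclidean distance between k-word frequency vectors are all assumptions made for this example.

```python
from collections import Counter
from itertools import product
import math


def kword_counts(seq, k):
    """Count overlapping k-words (k-mers) in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kword_frequencies(seq, k, alphabet="ACGT"):
    """Return relative k-word frequencies in a fixed word order.

    Words containing symbols outside the alphabet (e.g. N) are ignored.
    """
    counts = kword_counts(seq, k)
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def kword_distance(seq_a, seq_b, k=3):
    """Euclidean distance between two k-word frequency vectors.

    Assumption: Euclidean distance is one common choice for word-count
    methods; it is not necessarily the distance used in the paper.
    """
    fa = kword_frequencies(seq_a, k)
    fb = kword_frequencies(seq_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

A pairwise distance matrix computed this way could then be passed to a tree-building method such as neighbor joining, and the resulting tree compared with a reference tree by RF distance. For fixed k, counting takes one pass over each sequence, which is why word-count methods are cheap compared with alignment.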