Classification of various genomic sequences based on distribution of repeated k-word.

Publication information

Annu Int Conf IEEE Eng Med Biol Soc. 2017 Jul;2017:3894-3897. doi: 10.1109/EMBC.2017.8037707.

Abstract

Both alignment-based and alignment-free methods are used to extract phylogenetic information from DNA sequences. Alignment-based methods have high computational complexity, while conventional alignment-free methods have low accuracy. This paper proposes a new alignment-free method based on the distribution of a repeated k-word measure. The measure is built on k-words and their multiple repeated words. The proposed scheme achieves higher performance than conventional word-count methods while maintaining the same total time complexity, and it outperforms conventional alignment-free methods with respect to Robinson-Foulds (RF) distance.
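For orientation, the conventional word-count baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of a generic k-word frequency distance, not the repeated k-word measure proposed in the paper, whose definition the abstract does not give. The function names, the ACGT alphabet, and the choice of Euclidean distance are all assumptions made for this example.

```python
from collections import Counter
from itertools import product
import math


def kword_counts(seq, k):
    """Count every overlapping k-word (k-mer) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kword_frequencies(seq, k, alphabet="ACGT"):
    """Return the relative frequency of each possible k-word, in a fixed order.

    Words containing symbols outside the alphabet (for example N) are ignored.
    """
    counts = kword_counts(seq, k)
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def kword_distance(a, b, k):
    """Euclidean distance between two k-word frequency vectors (assumed metric)."""
    fa = kword_frequencies(a, k)
    fb = kword_frequencies(b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

A pairwise matrix of such distances is the usual input to a tree-building step such as neighbor joining; the resulting tree can then be compared with a reference tree by RF distance. Counting takes a single pass over each sequence, which is why word-count methods are cheap compared with alignment.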

