Suppr超能文献

基于熵轮廓的序列分析高效算法。

Efficient Algorithms for Sequence Analysis with Entropic Profiles.

出版信息

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):117-128. doi: 10.1109/TCBB.2016.2620143. Epub 2016 Oct 21.

Abstract

Entropy, being closely related to repetitiveness and compressibility, is a widely used information-related measure to assess the degree of predictability of a sequence. Entropic profiles are based on information theory principles, and can be used to study the under-/over-representation of subwords, by also providing information about the scale of conserved DNA regions. Here, we focus on the algorithmic aspects related to entropic profiles. In particular, we propose linear time algorithms for their computation that rely on suffix-based data structures, more specifically on the truncated suffix tree (TST) and on the enhanced suffix array (ESA). We performed an extensive experimental campaign showing that our algorithms, beside being faster, make it possible the analysis of longer sequences, even for high degrees of resolution, than state of the art algorithms.

摘要

熵与重复性和可压缩性密切相关,是一种广泛用于评估序列可预测程度的与信息相关的度量。熵谱基于信息论原理,可用于研究子词的欠/过表示,同时还提供有关保守 DNA 区域规模的信息。在这里,我们专注于与熵谱相关的算法方面。特别是,我们提出了基于后缀的基于数据结构的线性时间算法,更具体地说,是基于截断后缀树(TST)和增强后缀数组(ESA)。我们进行了广泛的实验活动,表明我们的算法不仅更快,而且可以分析更长的序列,即使对于更高的分辨率,也比最先进的算法能够实现。

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验