An efficient binomial model-based measure for sequence comparison and its application.

Author information

School of Science, Hangzhou Dianzi University, Hangzhou 310018, People's Republic of China.

Publication information

J Biomol Struct Dyn. 2011 Apr;28(5):833-43. doi: 10.1080/07391102.2011.10508611.

Abstract

Sequence comparison is one of the major tasks in bioinformatics; it can provide evidence of structural and functional conservation, as well as of evolutionary relationships. Several similarity/dissimilarity measures exist for sequence comparison, but challenges remain. This paper presents a binomial model-based measure for analyzing biological sequences. With the help of a random indicator, the occurrence of a word at any position of a sequence can be regarded as a Bernoulli random variable, and the sum of these word occurrences is well known to follow a binomial distribution. Using a recursive formula, we computed the binomial probability of the word count and proposed a measure based on the relative entropy. The proposed measure was tested in extensive experiments, including classification of HEV genotypes and phylogenetic analysis, and was compared with alignment-based and alignment-free measures. The results demonstrate that the proposed binomial model-based measure is more efficient.
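The pipeline described in the abstract (word counts, a recursively computed binomial probability, and a relative-entropy comparison) can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so everything below is an assumption made for illustration: the word probability is taken from an i.i.d. base model, the recursion is the standard binomial one, P(j+1) = P(j) · (n−j)/(j+1) · p/(1−p), and the distance is a symmetrised relative entropy between normalised profiles. The function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import product


def word_counts(seq, k):
    """Count overlapping k-words in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def word_probability(word, base_freq):
    """Word probability under an ASSUMED i.i.d. base model."""
    p = 1.0
    for ch in word:
        p *= base_freq[ch]
    return p


def binomial_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p), built up recursively from
    P(0) = (1 - p)^n via P(j+1) = P(j) * (n - j) / (j + 1) * p / (1 - p).
    Computed in log space to limit underflow for long sequences."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_prob = n * math.log1p(-p)
    log_odds = math.log(p) - math.log1p(-p)
    for j in range(k):
        log_prob += math.log(n - j) - math.log(j + 1) + log_odds
    return math.exp(log_prob)


def binomial_profile(seq, k, alphabet="ACGT"):
    """Normalised vector of binomial probabilities of the observed
    count of every k-word (the normalisation is an assumption)."""
    n = len(seq) - k + 1  # number of word positions
    base_counts = Counter(seq)
    base_freq = {b: base_counts[b] / len(seq) for b in alphabet}
    counts = word_counts(seq, k)
    raw = []
    for letters in product(alphabet, repeat=k):
        word = "".join(letters)
        p = word_probability(word, base_freq)
        # Floor at a tiny value so the logarithms below stay finite.
        raw.append(max(binomial_pmf(n, p, counts[word]), 1e-12))
    total = sum(raw)
    return [v / total for v in raw]


def relative_entropy_distance(seq_a, seq_b, k=2):
    """Symmetrised relative entropy between two binomial profiles."""
    pa = binomial_profile(seq_a, k)
    pb = binomial_profile(seq_b, k)
    return sum(a * math.log(a / b) + b * math.log(b / a)
               for a, b in zip(pa, pb))
```

Under these assumptions, `relative_entropy_distance(s, t)` is zero when the two sequences are identical and non-negative otherwise; a matrix of such pairwise distances could then be passed to a standard tree-building method for the kind of phylogenetic analysis the abstract mentions.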
