An efficient binomial model-based measure for sequence comparison and its application.

Author information

School of Science, Hangzhou Dianzi University, Hangzhou 310018, People's Republic of China.

Publication information

J Biomol Struct Dyn. 2011 Apr;28(5):833-43. doi: 10.1080/07391102.2011.10508611.

DOI:10.1080/07391102.2011.10508611

Abstract

Sequence comparison is one of the major tasks in bioinformatics; its results can serve as evidence of structural and functional conservation, as well as of evolutionary relationships. Several similarity/dissimilarity measures exist for sequence comparison, but challenges remain. This paper presents a binomial model-based measure for analyzing biological sequences. With the help of a random indicator, the occurrence of a word at any position of a sequence can be regarded as a Bernoulli random variable, and the sum of these occurrences (the word count) is well known to follow a binomial distribution. Using a recursive formula, we compute the binomial probability of the word count and propose a binomial model-based measure based on relative entropy. The proposed measure was tested in extensive experiments, including classification of HEV genotypes and phylogenetic analysis, and was further compared with alignment-based and alignment-free measures. The results demonstrate that the proposed binomial model-based measure is more efficient.
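The abstract names the ingredients (word counts, a recursively computed binomial probability, relative entropy) but gives no formulas, so the following is only a minimal sketch of how they might fit together, not the authors' implementation. Several parts are assumptions made for illustration: the word probability is taken as the product of base frequencies (an i.i.d. background), the per-word binomial probabilities are normalized into a profile, and the two profiles are compared with a symmetrized relative entropy. The names `binomial_pmf`, `word_profile`, and `dissimilarity` are hypothetical.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA sequences containing only A, C, G, T


def binomial_pmf(n, p, c):
    """P(X = c) for X ~ Binomial(n, p), using the recursion
    P(j + 1) = P(j) * (n - j) / (j + 1) * p / (1 - p), in log space."""
    if not 0 <= c <= n:
        return 0.0
    if p <= 0.0:
        return 1.0 if c == 0 else 0.0
    if p >= 1.0:
        return 1.0 if c == n else 0.0
    log_prob = n * math.log1p(-p)  # log P(X = 0) = n * log(1 - p)
    log_odds = math.log(p) - math.log1p(-p)
    for j in range(c):
        log_prob += math.log(n - j) - math.log(j + 1) + log_odds
    return math.exp(log_prob)


def word_profile(seq, k):
    """Normalized binomial probabilities of the observed k-word counts.

    Assumption: the probability of a word is the product of the base
    frequencies of the sequence (i.i.d. background model).
    """
    n = len(seq) - k + 1  # number of positions where a k-word can start
    counts = Counter(seq[i:i + k] for i in range(n))
    base_freq = {b: seq.count(b) / len(seq) for b in ALPHABET}
    profile = []
    for word in map("".join, product(ALPHABET, repeat=k)):
        p_word = math.prod(base_freq[b] for b in word)
        profile.append(binomial_pmf(n, p_word, counts[word]))
    total = sum(profile)
    return [x / total for x in profile]


def dissimilarity(seq_a, seq_b, k=2, floor=1e-300):
    """Symmetrized relative entropy between two word profiles.

    Assumption: this symmetric form is chosen for illustration only;
    `floor` guards against log(0) if a probability underflows.
    """
    total = 0.0
    for x, y in zip(word_profile(seq_a, k), word_profile(seq_b, k)):
        x, y = max(x, floor), max(y, floor)
        total += (x - y) * math.log(x / y)  # KL(a||b) + KL(b||a), termwise
    return total


if __name__ == "__main__":
    print(dissimilarity("ACGTACGTACGT", "AAAACCCCGGTT", k=2))
```

Working in log space keeps the recursion stable for long sequences, where the starting value (1 - p)^n would otherwise underflow. Identical sequences give a dissimilarity of zero under this sketch, and the value is symmetric in its two arguments.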

Similar articles

Numerical characteristics of word frequencies and their application to dissimilarity measure for sequence comparison.

J Theor Biol. 2011 May 7;276(1):174-80. doi: 10.1016/j.jtbi.2011.02.005. Epub 2011 Feb 18.

A novel statistical measure for sequence comparison on the basis of k-word counts.

J Theor Biol. 2013 Feb 7;318:91-100. doi: 10.1016/j.jtbi.2012.10.035. Epub 2012 Nov 9.

Weighted relative entropy for alignment-free sequence comparison based on Markov model.

J Biomol Struct Dyn. 2011 Feb;28(4):545-55. doi: 10.1080/07391102.2011.10508594.

Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

Bioinformatics. 2008 Oct 15;24(20):2296-302. doi: 10.1093/bioinformatics/btn436. Epub 2008 Aug 18.

Alignment free comparison: similarity distribution between the DNA primary sequences based on the shortest absent word.

J Theor Biol. 2012 Feb 21;295:125-31. doi: 10.1016/j.jtbi.2011.11.021. Epub 2011 Dec 1.

A score method for comparison of partial genomic regions in their representatives of full-length genome of hepatitis E virus for genotyping.

Intervirology. 2007;50(5):328-35. doi: 10.1159/000106805. Epub 2007 Aug 7.

A measure of DNA sequence dissimilarity based on free energy of nearest-neighbor interaction.

J Biomol Struct Dyn. 2011 Feb;28(4):557-65. doi: 10.1080/07391102.2011.10508595.

Using Gaussian model to improve biological sequence comparison.

J Comput Chem. 2010 Jan 30;31(2):351-61. doi: 10.1002/jcc.21322.

The Burrows-Wheeler similarity distribution between biological sequences based on Burrows-Wheeler transform.

J Theor Biol. 2010 Feb 21;262(4):742-9. doi: 10.1016/j.jtbi.2009.10.033. Epub 2009 Nov 10.
