Maximum entropy weighting of aligned sequences of proteins or DNA.

Authors

Krogh A, Mitchison G

Affiliation

Laboratory of Molecular Biology, Cambridge, England.

Publication

Proc Int Conf Intell Syst Mol Biol. 1995;3:215-21.

PMID: 7584440

Abstract

In a family of proteins or other biological sequences like DNA the various subfamilies are often very unevenly represented. For this reason a scheme for assigning weights to each sequence can greatly improve performance at tasks such as database searching with profiles or other consensus models based on multiple alignments. A new weighting scheme for this type of database search is proposed. In a statistical description of the searching problem it is derived from the maximum entropy principle. It can be proved that, in a certain sense, it corrects for uneven representation. It is shown that finding the maximum entropy weights is an easy optimization problem for which standard techniques are applicable.
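
The abstract does not spell out the objective, so the following is only a minimal sketch of one plausible reading: choose non-negative sequence weights that sum to one and maximize the summed Shannon entropy of the weighted residue distribution in each alignment column. The function name `max_entropy_weights`, the exponentiated-gradient update, the step size, the iteration count, and the treatment of every character (including gaps) as an ordinary symbol are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def max_entropy_weights(seqs, n_iter=500, step=0.5):
    """Hypothetical sketch: weights on the simplex that maximize the sum of
    per-column entropies of the weighted symbol distribution."""
    alphabet = sorted(set("".join(seqs)))
    index = {a: k for k, a in enumerate(alphabet)}
    # X[n, l] = symbol index of sequence n at column l (equal lengths assumed)
    X = np.array([[index[c] for c in s] for s in seqs])
    n_seq, n_col = X.shape
    onehot = np.eye(len(alphabet))[X]            # shape (n_seq, n_col, alphabet)
    w = np.full(n_seq, 1.0 / n_seq)              # start from uniform weights
    for _ in range(n_iter):
        # p[l, a] = total weight of sequences showing symbol a in column l
        p = np.einsum("n,nla->la", w, onehot)
        # p_own[n, l] = weighted frequency of sequence n's own symbol in column l
        p_own = p[np.arange(n_col), X]
        # Entropy gradient w.r.t. w[n], up to a constant shared by all n
        grad = -np.log(p_own).sum(axis=1)
        grad -= grad.mean()                      # constant shift, for stability
        # Multiplicative (exponentiated-gradient) step keeps weights positive
        w = w * np.exp(step / n_col * grad)
        w /= w.sum()                             # renormalize onto the simplex
    return w

# Toy alignment: one subfamily is represented three times, the other once.
weights = max_entropy_weights(["AAAA", "AAAA", "AAAA", "CCCC"])
print(weights)
```

In this toy alignment the over-represented subfamily should end up sharing roughly half of the total weight, with the single divergent sequence taking the other half, which is the kind of correction for uneven representation the abstract describes. Because the objective in this reading is concave in the weights, any standard constrained optimizer could replace the simple update used here.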

