
Identification of sequence motifs from a set of proteins with related function.

Author information

Saqi M A, Sternberg M J

Affiliation

Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, London, UK.

Publication information

Protein Eng. 1994 Feb;7(2):165-71. doi: 10.1093/protein/7.2.165.

Abstract

The automatic identification of motifs associated with a given function is an important challenge for molecular sequence analysis. A method is presented for the extraction of such patterns from large sets of unaligned sequences with related but general function, for example, a set of heat shock proteins. In such a set of proteins there can often be several subfamilies each characterized by one or more distinct motifs. The aim is to develop computational tools to identify these motifs. The algorithm presented locates high frequency words of length k with a given number of positions, r, fixed. Statistics for a binomial distribution are used to assess the significance of the words. The high-frequency words are clustered and highly populated clusters retained. The composition of the clusters is displayed graphically. A set of motifs associated with the sequence family can automatically be extracted. The method is benchmarked on a set of 106 heat shock sequences and a set of 257 toxin sequences. It is shown to recover previously identified motifs.
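
The abstract gives no implementation details, so the following is only a minimal Python sketch of the two steps it names: counting words of length k with r fixed positions across unaligned sequences, and scoring a count with a binomial tail probability. The function names, the use of `.` for free positions, the rule that the first position is always fixed, the toy sequences, and the chance probability `p_chance` are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import comb


def gapped_words(seq, k, r):
    """Yield words of length k with r fixed residues; '.' marks a free position.

    Assumption: the first position is always fixed, so shifted copies of the
    same pattern are not counted as separate words.
    """
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        # Choose the remaining r - 1 fixed positions among positions 1..k-1.
        for rest in combinations(range(1, k), r - 1):
            fixed = {0, *rest}
            yield "".join(c if i in fixed else "." for i, c in enumerate(window))


def count_sequences_with_word(sequences, k, r):
    """For each word, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each sequence contributes at most once per word.
        counts.update(set(gapped_words(seq, k, r)))
    return counts


def binomial_tail(n_obs, n_seqs, p):
    """P(X >= n_obs) for X ~ Binomial(n_seqs, p)."""
    return sum(
        comb(n_seqs, i) * p**i * (1 - p) ** (n_seqs - i)
        for i in range(n_obs, n_seqs + 1)
    )


# Toy sequences (invented for this example, not from the benchmark sets).
toy = ["MKTAYIAK", "GGTAYQQ", "TAYLL", "PPPP"]
counts = count_sequences_with_word(toy, k=3, r=2)

# Hypothetical probability that one sequence contains a given word by chance.
p_chance = 0.05
for word, n in counts.most_common(3):
    print(word, n, binomial_tail(n, len(toy), p_chance))
```

In practice the chance probability would depend on residue composition and sequence length rather than being a single constant, and the significant words would then be passed to the clustering step, which the abstract does not describe in enough detail to sketch here.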
