Combining sensitive database searches with multiple intermediates to detect distant homologues.

Authors

Salamov A A, Suwa M, Orengo C A, Swindells M B

Affiliation

Helix Research Institute, Chiba, Japan.

Publication

Protein Eng. 1999 Feb;12(2):95-100. doi: 10.1093/protein/12.2.95.

Abstract

Using data from the CATH structure classification, we assessed the BLASTP, FASTA, Smith-Waterman and gapped BLAST algorithms, developed a portable normalization scheme and identified safe thresholds for database searching. Of the four methods assessed, FASTA, Smith-Waterman and gapped BLAST performed similarly, whereas BLASTP was much less sensitive. Introducing an intermediate sequence search substantially improved the results. When tested on a set of relationships that BLASTP could not identify, intermediate sequences found twice as many relationships as the Smith-Waterman algorithm alone. However, the benefit of using intermediates varied considerably between families and depended not only on the number of available sequences but also on their diversity. To increase sensitivity further, a multiple intermediate sequence search (MISS) procedure was developed. When assessed on 1906 cases from a wide range of homologous families that the previous approaches could not detect, MISS identified 241 additional relationships. MISS uses the full extent of sequence diversity to detect additional relationships but does not consider any structure-specific information. For this reason, it is more generally applicable than fold recognition and threading methods, which require a library of known structures.
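
The abstract does not describe how the intermediate searches are chained, so the following is only a toy sketch of the general idea, not the authors' procedure. It assumes that pairwise searches have already been run and that every hit scoring above a safe threshold is stored in a hypothetical dictionary `significant_hits`. A query is then linked to a distant homologue if a chain of significant hits connects them through at most `max_intermediates` intermediate sequences. The function name, the data layout and the sequence identifiers are all illustrative assumptions.

```python
from collections import deque


def chained_search(query, significant_hits, max_intermediates=1):
    """Find sequences linked to `query` through chains of significant hits.

    significant_hits: hypothetical dict mapping a sequence id to the set of
    ids it matched above a chosen threshold in a pairwise database search.
    Returns a dict {sequence id: shortest chain of ids starting at query}.
    """
    chains = {query: [query]}
    queue = deque([query])
    while queue:
        current = queue.popleft()
        chain = chains[current]
        # Extending this chain would use len(chain) - 1 intermediates.
        if len(chain) - 1 > max_intermediates:
            continue
        for hit in sorted(significant_hits.get(current, ())):
            if hit not in chains:  # breadth-first order keeps the shortest chain
                chains[hit] = chain + [hit]
                queue.append(hit)
    del chains[query]
    return chains


# Toy data: "target" is only reachable through two intermediates.
toy_hits = {
    "query": {"hom1"},
    "hom1": {"query", "hom2"},
    "hom2": {"hom1", "target"},
    "target": {"hom2"},
}

direct = chained_search("query", toy_hits, max_intermediates=0)
single = chained_search("query", toy_hits, max_intermediates=1)
multiple = chained_search("query", toy_hits, max_intermediates=2)
```

With this toy data, a direct search reaches only `hom1`, one intermediate adds `hom2`, and two intermediates are needed to reach `target`. A real pipeline would also have to guard against false positives accumulating along a chain, which is why the threshold used to build `significant_hits` matters.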
