FASTA-SWAP and FASTA-PAT: pattern database searches using combinations of aligned amino acids, and a novel scoring theory.

Authors

Ladunga I, Wiese B A, Smith R F

Affiliation

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Publication

J Mol Biol. 1996 Jun 21;259(4):840-54. doi: 10.1006/jmbi.1996.0362.

DOI: 10.1006/jmbi.1996.0362
PMID: 8683587
Abstract

We introduce two new pattern database search tools that utilize statistical significance and information theory to improve protein function identification. Both the general pattern scoring theory with the specific matrices introduced here and the low redundancy of pattern databases increase search sensitivity and selectivity. Pattern scoring preferentially rewards matches at conserved positions in a pattern with higher scores than matches at variable positions, and assigns more negative scores to mismatches at conserved positions than to mismatches at variable positions. The theory of pattern scoring can be used to create log-odds pattern scores for patterns derived from any set of multiple alignments. This theoretical framework can be used to adapt existing sequence database search tools to pattern analysis. Our FASTA-SWAP and FASTA-PAT tools are extensions of the FASTA program that search a sequence query against a pattern database. In the first step, FASTA-SWAP searches the diagonals of the query sequence and the library pattern for high-scoring segments, while FASTA-PAT performs an extended version of hashing. In the second step, both methods refine the alignments and the scores using dynamic programming. The tools utilize an extremely compact binary representation of all possible combinations of amino acid residues in aligned positions. Our FASTA-SWAP and FASTA-PAT tools are well suited for functional identification of distant relatives that may be missed by sequence database search methods. FASTA-SWAP and FASTA-PAT searches can be performed using our World-Wide Web Server (http://dot.imgen.bcm.tmc.edu:9331/seq-search/Options/fastapat.html).
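
The abstract does not give the bit layout or the scoring formulas, so the following is only a rough sketch of the two ideas it names: a compact binary encoding of the residue combinations seen at each aligned position, and log-odds scores that reward matches and penalize mismatches more strongly at conserved positions. Everything specific here is an assumption for illustration, not the authors' method: the residue order, the uniform background frequency, the `pseudo` probability mass reserved for unobserved residues, and the function names `column_mask`, `pattern_from_alignment`, `position_scores`, and `score_ungapped`.

```python
import math

# Assumed residue order; the abstract does not specify the actual bit layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BIT = {aa: 1 << i for i, aa in enumerate(AMINO_ACIDS)}


def column_mask(column):
    """Encode the set of residues seen in one alignment column as a 20-bit integer."""
    mask = 0
    for aa in column:
        mask |= BIT.get(aa, 0)  # gaps and unknown symbols contribute no bit
    return mask


def pattern_from_alignment(rows):
    """Turn equal-length aligned sequences into a list of per-column bit masks."""
    return [column_mask(col) for col in zip(*rows)]


def position_scores(mask, pseudo=0.01):
    """Return hypothetical (match, mismatch) log-odds scores in bits for one position.

    Assumption: the k residues in the mask share probability 1 - pseudo equally,
    the other 20 - k residues share pseudo, and the background is uniform (1/20).
    """
    k = bin(mask).count("1")
    n = len(AMINO_ACIDS)
    if k == 0 or k == n:
        return 0.0, 0.0  # an empty or fully variable column carries no information
    q = 1.0 / n
    match = math.log2(((1.0 - pseudo) / k) / q)
    mismatch = math.log2((pseudo / (n - k)) / q)
    return match, mismatch


def score_ungapped(query, pattern, offset=0):
    """Sum per-position scores along one diagonal (no gaps) of query vs. pattern."""
    total = 0.0
    for i, mask in enumerate(pattern):
        match, mismatch = position_scores(mask)
        total += match if BIT.get(query[offset + i], 0) & mask else mismatch
    return total


pattern = pattern_from_alignment(["ACD", "ACE", "AGD"])
print(pattern)
print(score_ungapped("ACD", pattern), score_ungapped("WCD", pattern))
```

Under these assumptions, membership of a query residue in a pattern position is a single bitwise AND, and a fully conserved column (one bit set) yields both a higher match score and a more negative mismatch score than a variable column, which is the qualitative behavior the abstract describes. A gapped refinement by dynamic programming, the second step in the abstract, is not shown.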


