MAGIIC-PRO: detecting functional signatures by efficient discovery of long patterns in protein sequences.

Authors

Hsu Chen-Ming, Chen Chien-Yu, Liu Baw-Jhiune

Affiliation

Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 320, Taiwan, Republic of China.

Publication information

Nucleic Acids Res. 2008 Mar;36(4):1400-6. doi: 10.1093/nar/gkm717.

DOI:10.1093/nar/gkm717
PMID:18314547
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3143912/
Abstract

This paper presents a web service named MAGIIC-PRO, which aims to discover functional signatures of a query protein by sequential pattern mining. Automatic discovery of patterns from unaligned biological sequences is an important problem in molecular biology. MAGIIC-PRO differs from several previously established methods performing similar tasks in two major ways. The first remarkable feature of MAGIIC-PRO is its efficiency in delivering long patterns. By incorporating a new type of gap constraint and some state-of-the-art data mining techniques, MAGIIC-PRO usually identifies satisfactory patterns within an acceptable response time. This efficiency enables users to quickly discover functional signatures whose residues do not come from a single region of the protein sequences or are conserved in only a few members of a protein family. The second remarkable feature of MAGIIC-PRO is its effort in refining the mining results. Considering large flexible gaps improves the completeness of the derived functional signatures. Users are guided directly to the patterns with as many blocks as are conserved simultaneously. In this paper, we show by experiments that MAGIIC-PRO is efficient and effective in identifying ligand-binding sites and hot regions in protein-protein interactions directly from sequences. The web service is available at http://biominer.bime.ntu.edu.tw/magiicpro and a mirror site at http://biominer.cse.yzu.edu.tw/magiicpro.
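
To make the idea of a pattern built from conserved blocks separated by flexible gaps more concrete, here is a minimal Python sketch. It is not the MAGIIC-PRO algorithm: it only checks how many sequences in a set contain a given gapped pattern, rather than mining such patterns. The function names, the (min, max) representation of a gap, and the toy sequences are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical representation: a gap is a (min, max) range of arbitrary
# residues allowed between two consecutive conserved blocks.
Gap = Tuple[int, int]


def pattern_to_regex(blocks: List[str], gaps: List[Gap]) -> re.Pattern:
    """Compile literal blocks joined by bounded flexible gaps into a regex."""
    if not blocks or len(gaps) != len(blocks) - 1:
        raise ValueError("need exactly one gap between each pair of blocks")
    parts = [re.escape(blocks[0])]
    for (lo, hi), block in zip(gaps, blocks[1:]):
        parts.append(f".{{{lo},{hi}}}")  # e.g. ".{2,4}" for a gap of 2 to 4
        parts.append(re.escape(block))
    return re.compile("".join(parts))


def support(blocks: List[str], gaps: List[Gap], sequences: List[str]) -> float:
    """Fraction of sequences that contain the gapped pattern at least once."""
    if not sequences:
        raise ValueError("sequence set is empty")
    regex = pattern_to_regex(blocks, gaps)
    hits = sum(1 for seq in sequences if regex.search(seq))
    return hits / len(sequences)


# Toy, made-up sequences (not real proteins).
toy_sequences = ["MKCAAHGGGGDEW", "TTCAAHPPDEWQ", "CAAHDEW", "MKLVV"]

# Two blocks, "CAAH" and "DEW", separated by 2 to 4 arbitrary residues.
toy_support = support(["CAAH", "DEW"], [(2, 4)], toy_sequences)
print(toy_support)
```

In this toy set the pattern occurs in the first two sequences only: the third has both blocks but no residues between them, and the fourth has neither block. Widening the allowed gap range lets one pattern cover blocks that lie far apart in the sequence, which is the intuition behind considering large flexible gaps.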

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e86c/3143912/e6864fa97c34/gkm717f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e86c/3143912/7e3d4107a65f/gkm717f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e86c/3143912/519b03054c8e/gkm717f2.jpg
