Motif identification neural design for rapid and sensitive protein family search.

Authors

Wu C H, Zhao S, Chen H L, Lo C J, McLarty J

Affiliation

Department of Epidemiology/Biomathematics, University of Texas Health Center at Tyler 75710, USA.

Publication information

Comput Appl Biosci. 1996 Apr;12(2):109-18. doi: 10.1093/bioinformatics/12.2.109.

DOI: 10.1093/bioinformatics/12.2.109
PMID: 8744773
Abstract

A new method, the motif identification neural design (MOTIFIND), has been developed for rapid and sensitive protein family identification. The method is an extension of our previous gene classification artificial neural system and employs new designs to enhance the detection of distant relationships. The new designs include an n-gram term weighting algorithm for extracting local motif patterns, an enhanced n-gram method for extracting residues of long-range correlation, and integrated neural networks for combining global and motif sequence information. The system has been tested and compared with several existing methods using three protein families, the cytochrome c, cytochrome b and flavodoxin. Overall it achieves 100% sensitivity and > 99.6% specificity, an accuracy comparable to BLAST, but at a speed of approximately 20 times faster. The system is much more robust than the PROSITE search which is based on simple signature patterns. MOTIFIND also compares favorably with BLIMPS, the Hidden Markov Model and PROFILESEARCH in detecting fragmentary sequences lacking complete motif regions and in detecting distant relationships, especially for members of under-represented subgroups within a family. MOTIFIND may be generally applicable to other proteins and has the potential to become a full-scale database search and sequence analysis tool.
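The abstract mentions an n-gram term weighting algorithm for extracting local motif patterns but does not give its formula. As a rough illustration only, the sketch below counts overlapping n-grams in protein sequences and weights each n-gram by how much more frequent it is in motif regions than in full-length sequences. The function names, the ratio-based weight, and the toy sequences are all assumptions made for this example; they are not taken from MOTIFIND.

```python
from collections import Counter


def ngram_counts(seq, n=2):
    """Count overlapping n-grams (length-n substrings) in a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def motif_weights(motif_seqs, full_seqs, n=2, eps=1e-6):
    """Hypothetical weighting (NOT the published algorithm):
    relative frequency of an n-gram in motif regions divided by its
    relative frequency in full-length sequences. Values above 1 mean
    the n-gram is enriched in the motif regions."""
    motif, full = Counter(), Counter()
    for s in motif_seqs:
        motif.update(ngram_counts(s, n))
    for s in full_seqs:
        full.update(ngram_counts(s, n))
    motif_total = sum(motif.values()) or 1
    full_total = sum(full.values()) or 1
    return {
        g: (motif[g] / motif_total) / (full[g] / full_total + eps)
        for g in motif
    }


# Toy data: a made-up motif region and a made-up sequence containing it.
motif_regions = ["CAACH"]
full_sequences = ["MKCAACHGGGG"]

counts = ngram_counts("CAACH", n=2)
weights = motif_weights(motif_regions, full_sequences, n=2)
```

In this toy case every bigram from the motif region receives a weight above 1, because the flanking residues dilute its frequency in the full sequence. A real system would derive motif regions from curated family alignments and feed the weighted counts to a classifier.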

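The reported figures (100% sensitivity, > 99.6% specificity) follow the standard definitions of these metrics. The sketch below shows how they are computed from classification counts; the counts in the example are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual family members that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-members that were correctly rejected."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts chosen only to illustrate the arithmetic.
sens = sensitivity(true_pos=50, false_neg=0)
spec = specificity(true_neg=997, false_pos=3)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With no missed family members the sensitivity is 100%, and 3 false positives among 1000 non-members give a specificity of 99.7%.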
