
Predicting location and structure of beta-sheet regions using stochastic tree grammars.

Authors

Mamitsuka H, Abe N

Affiliation

Theory NEC Laboratory, RWCP, Kawasaki, Japan.

Publication

Proc Int Conf Intell Syst Mol Biol. 1994;2:276-84.

PMID: 7584401

Abstract

We describe and demonstrate the effectiveness of a method of predicting protein secondary structures, beta-sheet regions in particular, using a class of stochastic tree grammars as representational language for their amino acid sequence patterns. The family of stochastic tree grammars we use, the Stochastic Ranked Node Rewriting Grammars (SRNRG), is one of the rare families of stochastic grammars that are expressive enough to capture the kind of long-distance dependencies exhibited by the sequences of beta-sheet regions, and at the same time enjoy relatively efficient processing. We applied our method to real data obtained from the HSSP database and the results obtained are encouraging: using an SRNRG trained on data from a particular protein, our method was able to predict the location and structure of beta-sheet regions in a number of different proteins whose sequences are less than 25 per cent homologous to the training sequences. The learning algorithm we use is an extension of the 'Inside-Outside' algorithm for stochastic context-free grammars, but with a number of significant modifications. First, we restricted the grammars used to members of the 'linear' subclass of SRNRG, and devised simpler and faster algorithms for this subclass. Secondly, we reduced the alphabet size (i.e. the number of amino acids) by clustering the amino acids according to their physicochemical properties, gradually through the iterations of the learning algorithm. Finally, we parallelized our parsing algorithm to run on a highly parallel computer, a 32-processor CM-5, and were able to obtain a nearly linear speed-up.

(ABSTRACT TRUNCATED AT 250 WORDS)
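As a rough illustration of two ideas the abstract mentions, the sketch below computes the "inside" probability of a sequence under an ordinary stochastic context-free grammar in Chomsky normal form, after mapping amino acids onto a reduced alphabet. It is not the authors' method: SRNRG is more expressive than a context-free grammar, and the paper clusters amino acids gradually during learning rather than with a fixed table. The two-class grouping, the toy grammar, and all names (`reduce_alphabet`, `inside_probability`, `UNARY`, `BINARY`) are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical two-class grouping (an assumption, not taken from the paper):
# 'h' for a handful of hydrophobic residues, 'p' for everything else.
HYDROPHOBIC = set("AVLIMFWC")


def reduce_alphabet(sequence):
    """Map one-letter amino acid codes onto the reduced alphabet {'h', 'p'}."""
    return "".join("h" if residue in HYDROPHOBIC else "p" for residue in sequence)


def inside_probability(sequence, unary_rules, binary_rules, start="S"):
    """Return P(sequence | grammar) for an SCFG in Chomsky normal form.

    unary_rules:  {(A, terminal): probability} for rules A -> terminal
    binary_rules: {(A, B, C): probability}     for rules A -> B C
    """
    n = len(sequence)
    if n == 0:
        return 0.0
    # inside[(i, j)][A] = probability that A derives sequence[i..j] (inclusive)
    inside = defaultdict(lambda: defaultdict(float))
    for i, symbol in enumerate(sequence):
        for (a, terminal), p in unary_rules.items():
            if terminal == symbol:
                inside[(i, i)][a] += p
    # Fill longer spans from shorter ones, summing over every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                left = inside[(i, k)]
                right = inside[(k + 1, j)]
                for (a, b, c), p in binary_rules.items():
                    if b in left and c in right:
                        inside[(i, j)][a] += p * left[b] * right[c]
    return inside[(0, n - 1)].get(start, 0.0)


# Toy grammar (assumed): S -> S S (0.3) | 'h' (0.5) | 'p' (0.2)
UNARY = {("S", "h"): 0.5, ("S", "p"): 0.2}
BINARY = {("S", "S", "S"): 0.3}

if __name__ == "__main__":
    reduced = reduce_alphabet("VKI")  # 'hph' under the assumed grouping
    print(reduced, inside_probability(reduced, UNARY, BINARY))
```

The table is filled in O(n^3) time for a fixed grammar, which gives a sense of why the authors restricted themselves to a simpler subclass and parallelized parsing for their richer grammar family.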

Similar articles

1. Predicting location and structure of beta-sheet regions using stochastic tree grammars.
   Proc Int Conf Intell Syst Mol Biol. 1994;2:276-84.
2. Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures.
   Proc IEEE Comput Syst Bioinform Conf. 2004:290-9.
3. A stochastic context free grammar based framework for analysis of protein sequences.
   BMC Bioinformatics. 2009 Oct 8;10:323. doi: 10.1186/1471-2105-10-323.
4. Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures.
   Bioinformatics. 2005 Jun 1;21(11):2611-7. doi: 10.1093/bioinformatics/bti385. Epub 2005 Mar 22.
5. Product Grammars for Alignment and Folding.
   IEEE/ACM Trans Comput Biol Bioinform. 2015 May-Jun;12(3):507-19. doi: 10.1109/TCBB.2014.2326155.
6. Stochastic context-free grammars for tRNA modeling.
   Nucleic Acids Res. 1994 Nov 25;22(23):5112-20. doi: 10.1093/nar/22.23.5112.
7. An optimized parsing algorithm well suited to RNA folding.
   Proc Int Conf Intell Syst Mol Biol. 1995;3:222-30.
8. Corpus based learning of stochastic, context-free grammars combined with Hidden Markov Models for tRNA modelling.
   Int J Bioinform Res Appl. 2005;1(3):305-18. doi: 10.1504/IJBRA.2005.007908.
9. Recognition of human genes by stochastic parsing.
   Pac Symp Biocomput. 1998:228-39.
10. Introduction to stochastic context free grammars.
    Methods Mol Biol. 2014;1097:85-106. doi: 10.1007/978-1-62703-709-9_5.

Cited by

1. Probabilistic grammatical model for helix-helix contact site classification.
   Algorithms Mol Biol. 2013 Dec 18;8(1):31. doi: 10.1186/1748-7188-8-31.
2. β-sheet topology prediction with high precision and recall for β and mixed α/β proteins.
   PLoS One. 2012;7(3):e32461. doi: 10.1371/journal.pone.0032461. Epub 2012 Mar 9.
3. How many 3D structures do we need to train a predictor?
   Genomics Proteomics Bioinformatics. 2009 Sep;7(3):128-37. doi: 10.1016/S1672-0229(08)60041-8.
4. A stochastic context free grammar based framework for analysis of protein sequences.
   BMC Bioinformatics. 2009 Oct 8;10:323. doi: 10.1186/1471-2105-10-323.