Grammatical inference in bioinformatics.

Author information

Sakakibara Yasubumi

Affiliation

Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1051-62. doi: 10.1109/TPAMI.2005.140.

DOI:10.1109/TPAMI.2005.140

PMID:16013753

Abstract

Bioinformatics is an active research area aimed at developing intelligent systems for analysis in molecular biology. Many methods based on formal language theory, statistical theory, and learning theory have been developed for modeling and analyzing biological sequences such as DNA, RNA, and proteins. In particular, grammatical inference methods are expected to uncover grammatical structures hidden in biological sequences. In this article, we give an overview of our series of grammatical approaches to biological sequence analysis and of related research, focusing on learning stochastic grammars from biological sequences and predicting their functions based on the learned stochastic grammars.

Similar articles

Grammatical inference in bioinformatics.

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1051-62. doi: 10.1109/TPAMI.2005.140.

Probabilistic finite-state machines--part I.

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1013-25. doi: 10.1109/TPAMI.2005.147.

Parsing with probabilistic strictly locally testable tree languages.

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1040-50. doi: 10.1109/TPAMI.2005.144.

Probabilistic finite-state machines--part II.

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1026-39. doi: 10.1109/TPAMI.2005.148.

Learning deterministic finite automata with a smart state labeling evolutionary algorithm.

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1063-74. doi: 10.1109/TPAMI.2005.143.

Structural semantic interconnections: a knowledge-based approach to word sense disambiguation.

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1075-86. doi: 10.1109/TPAMI.2005.149.

Guest editors' introduction to the special section on syntactic and structural pattern recognition.

IEEE Trans Pattern Anal Mach Intell. 2005 Jul;27(7):1009-12. doi: 10.1109/TPAMI.2005.141.

A new distance measure for model-based sequence clustering.

IEEE Trans Pattern Anal Mach Intell. 2009 Jul;31(7):1325-31. doi: 10.1109/TPAMI.2008.268.

Online clustering algorithms for radar emitter classification.

IEEE Trans Pattern Anal Mach Intell. 2005 Aug;27(8):1185-96. doi: 10.1109/TPAMI.2005.166.

Convergence and application of online active sampling using orthogonal pillar vectors.

IEEE Trans Pattern Anal Mach Intell. 2004 Sep;26(9):1197-207. doi: 10.1109/TPAMI.2004.61.

Cited by

Use of a Novel Grammatical Inference Approach in Classification of Amyloidogenic Hexapeptides.

Comput Math Methods Med. 2016;2016:1782732. doi: 10.1155/2016/1782732. Epub 2016 Mar 9.

A grammar inference approach for predicting kinase specific phosphorylation sites.

PLoS One. 2015 Apr 17;10(4):e0122294. doi: 10.1371/journal.pone.0122294. eCollection 2015.

Lineage grammars: describing, simulating and analyzing population dynamics.

BMC Bioinformatics. 2014 Jul 21;15(1):249. doi: 10.1186/1471-2105-15-249.

Probabilistic grammatical model for helix-helix contact site classification.

Algorithms Mol Biol. 2013 Dec 18;8(1):31. doi: 10.1186/1748-7188-8-31.

A composite method based on formal grammar and DNA structural features in detecting human polymerase II promoter region.

PLoS One. 2013;8(2):e54843. doi: 10.1371/journal.pone.0054843. Epub 2013 Feb 20.

Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms.

Genomics Inform. 2012 Dec;10(4):266-70. doi: 10.5808/GI.2012.10.4.266. Epub 2012 Dec 31.

Peptide vocabulary analysis reveals ultra-conservation and homonymity in protein sequences.

Bioinform Biol Insights. 2009 Nov 24;1:101-26. doi: 10.4137/bbi.s415.

A stochastic context free grammar based framework for analysis of protein sequences.

BMC Bioinformatics. 2009 Oct 8;10:323. doi: 10.1186/1471-2105-10-323.
