Heuristic informational analysis of sequences.

Authors

Claverie J M, Bougueleret L

Publication information

Nucleic Acids Res. 1986 Jan 10;14(1):179-96. doi: 10.1093/nar/14.1.179.

Abstract

Nucleotide or amino-acid sequences are interpreted as successions of words of length k (k-tuples), the frequencies of which are highly variable across different statistical populations of genes or proteins. After k-tuple reference tables are built from coherent subsets or entire data banks, the local information content profile of individual sequences is drawn. Anomalous regions (peaks or depressions) of such a profile can lead to the discovery and identification of specific sequence patterns. Following the same principle, the simultaneous use of two reference statistical populations and the computation of an index combining the two information profiles lead to a general and powerful method of discriminant analysis. The identification of a "signal" associated with gene conversion, intron/exon discrimination, and the location of function-specific patterns in proteins are given as examples of successful applications of this heuristic informational approach.
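
The workflow outlined in the abstract (build a k-tuple reference table, draw a local information profile, then combine two profiles into a discriminant index) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so the choices here are assumptions: information content is taken as the mean surprisal, -log2(frequency), over a sliding window; the index is a plain difference of two profiles; and the function names, the `window` parameter, and the `floor` value for unseen k-tuples are all hypothetical.

```python
import math
from collections import Counter


def ktuple_table(sequences, k):
    """Relative frequencies of overlapping k-tuples in a reference population."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def information_profile(seq, table, k, window, floor=1e-6):
    """Mean surprisal (-log2 frequency) of the k-tuples in each sliding window.

    Assumption: k-tuples absent from the reference table get the small
    frequency `floor` so that the logarithm stays finite.
    """
    scores = [
        -math.log2(table.get(seq[i:i + k], floor))
        for i in range(len(seq) - k + 1)
    ]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def discriminant_index(seq, table_a, table_b, k, window, floor=1e-6):
    """Difference of the two profiles (an assumed form of the index).

    Positive values mark windows that look more typical of population A,
    negative values windows that look more typical of population B.
    """
    profile_a = information_profile(seq, table_a, k, window, floor)
    profile_b = information_profile(seq, table_b, k, window, floor)
    return [b - a for a, b in zip(profile_a, profile_b)]


# Toy reference populations, invented purely for illustration.
table_a = ktuple_table(["ACGTACGTACGT"], k=2)
table_b = ktuple_table(["AAAAAAAATTTT"], k=2)
index = discriminant_index("ACGTACGTAAAATTTT", table_a, table_b, k=2, window=3)
```

In this toy run the index is positive over the leading ACGT repeat and negative over the trailing A/T run. Peaks or depressions in a single profile would be read the same way, as regions whose k-tuple usage departs from the reference population.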

Similar articles

1
Heuristic informational analysis of sequences.
Nucleic Acids Res. 1986 Jan 10;14(1):179-96. doi: 10.1093/nar/14.1.179.
2
Rapid searches for complex patterns in biological molecules.
Nucleic Acids Res. 1984 Jan 11;12(1 Pt 1):263-80. doi: 10.1093/nar/12.1part1.263.
8
MULTAN: a program to align multiple DNA sequences.
Nucleic Acids Res. 1986 Jan 10;14(1):159-77. doi: 10.1093/nar/14.1.159.

Cited by

8
Computational methods for exon detection.
Mol Biotechnol. 1998 Aug;10(1):27-48. doi: 10.1007/BF02745861.
9
Self-identification of protein-coding regions in microbial genomes.
Proc Natl Acad Sci U S A. 1998 Aug 18;95(17):10026-31. doi: 10.1073/pnas.95.17.10026.

References

2
New approaches for computer analysis of nucleic acid sequences.
Proc Natl Acad Sci U S A. 1983 Sep;80(18):5660-4. doi: 10.1073/pnas.80.18.5660.
4
BIOLOG - a DNA sequence analysis system in PROLOG.
Nucleic Acids Res. 1984 Jan 11;12(1 Pt 2):633-42. doi: 10.1093/nar/12.1part2.633.
7
Fast computer search for similar DNA sequences.
Nucleic Acids Res. 1984 Jul 11;12(13):5471-4. doi: 10.1093/nar/12.13.5471.
9
Graphic methods to determine the function of nucleic acid sequences.
Nucleic Acids Res. 1984 Jan 11;12(1 Pt 2):521-38. doi: 10.1093/nar/12.1part2.521.
10
Protein and Nucleic Acid Sequence Database Systems.
Annu Rev Biophys Bioeng. 1983;12:419-41. doi: 10.1146/annurev.bb.12.060183.002223.
