Mathematical characterization of Chaos Game Representation. New algorithms for nucleotide sequence analysis.

Author information

Dutta C, Das J

Affiliation

Biophysics Division, Indian Institute of Chemical Biology, Calcutta.

Publication information

J Mol Biol. 1992 Dec 5;228(3):715-9. doi: 10.1016/0022-2836(92)90857-g.

Abstract

Chaos Game Representation (CGR) can recognize patterns in the nucleotide sequences, obtained from databases, of a class of genes using the techniques of fractal structures and by considering DNA sequences as strings composed of four units, G, A, T and C. Such recognition of patterns relies only on visual identification and no mathematical characterization of CGR is known. The present report describes two algorithms that can predict the presence or absence of a stretch of nucleotides in any gene family. The first algorithm can be used to generate DNA sequences represented by any point in the CGR. The second algorithm can simulate known CGR patterns for different gene families by setting the probabilities of occurrence of different di- or trinucleotides by a trial and error process using some guidelines and approximate rules-of-thumb. The validity of the second algorithm has been tested by simulating sequences that can mimic the CGRs of vertebrate non-oncogenes, proto-oncogenes and oncogenes. These algorithms can provide a mathematical basis of the CGR patterns obtained using nucleotide sequences from databases.
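
The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the two ideas it names, not the authors' method. It assumes the commonly used CGR construction (start at the centre of the unit square and move halfway toward the corner of each successive nucleotide) and an assumed corner layout, A bottom-left, C top-left, G top-right, T bottom-right. The names `cgr_points`, `decode_point` and `simulate`, and the transition table used in the example, are hypothetical. `decode_point` reads a sequence back from a CGR point by repeatedly checking which quadrant the point lies in; `simulate` draws a sequence from dinucleotide (first-order Markov) probabilities, which is one plausible way to set "probabilities of occurrence of different dinucleotides".

```python
import random

# Assumed corner layout; the abstract does not specify one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR point reached after each base of `seq`."""
    x, y = start
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def decode_point(point, length):
    """Recover the last `length` bases that lead to `point`.

    The quadrant containing the point identifies the most recent base;
    undoing the halving step gives the previous point.
    """
    x, y = point
    bases = []
    for _ in range(length):
        cx = 1.0 if x >= 0.5 else 0.0
        cy = 1.0 if y >= 0.5 else 0.0
        base = next(b for b, corner in CORNERS.items() if corner == (cx, cy))
        bases.append(base)
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(bases))


def simulate(transition, length, rng, first="A"):
    """Draw a sequence where transition[prev][next] weights each dinucleotide."""
    seq = [first]
    while len(seq) < length:
        probs = transition[seq[-1]]
        bases = list(probs)
        weights = [probs[b] for b in bases]
        seq.append(rng.choices(bases, weights=weights)[0])
    return "".join(seq)


# Hypothetical table: uniform transitions, except that the dinucleotide CG is forbidden.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
table = {base: dict(uniform) for base in "ACGT"}
table["C"] = {"A": 1 / 3, "C": 1 / 3, "G": 0.0, "T": 1 / 3}

last_point = cgr_points("GATC")[-1]
recovered = decode_point(last_point, 4)
simulated = simulate(table, 200, random.Random(0))
```

Plotting `cgr_points(simulated)` would leave the sub-squares addressed by CG empty, which illustrates how a dinucleotide probability translates into the presence or absence of a region in the CGR. Decoding is only meaningful for as many steps as the point actually encodes; beyond that the coordinates fall back on the starting point.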
