A general rule for ranged series of codon frequencies in different genomes.

Author information

Gusein-Zade S M

Affiliation

Institute of Molecular Genetics, USSR Academy of Sciences, Moscow.

Publication information

J Biomol Struct Dyn. 1989 Apr;6(5):1001-12. doi: 10.1080/07391102.1989.10506527.

DOI:10.1080/07391102.1989.10506527
PMID:2556159
Abstract

Information science commonly describes the distribution of information units (words) by frequency of occurrence using a ranged series, i.e., the sequence of occurrence frequencies p_1, p_2, ..., p_r, ... taken in decreasing order. The most commonly used model is the Zipf rule, or Zipf law, in which p_r is inversely proportional to some power of the rank r: p_r = C/r^z (C, z > 0). Upon analysis, the agreement between the codon distribution and the Zipf model is found to be unsatisfactory. The distribution of letters (in English and some other languages) by frequency of occurrence does not obey the Zipf rule either. A new model is proposed for distributions of this kind, in which p_r = C·(ln(n + 1) - ln r), where n is the number of distinct symbols (codons). This dependence is approximated by a straight line not in the (ln r, ln p) coordinate system, as in the Zipf model, but in the (ln r, p) coordinate system. Statistical criteria show that this model is in good agreement with the ranged series of codon frequencies for the best-studied genomes to date. This result may be regarded as an additional argument in favor of the codon-letter analogy (rather than the codon-word analogy) in genetic texts.
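
The contrast between the two models can be checked numerically. The following is a minimal sketch, not the author's procedure: it assumes n = 64 (all triplets counted as distinct codons), fixes C by requiring the frequencies to sum to 1 (the abstract does not say how C is chosen), and uses an arbitrary Zipf exponent z = 1. The function names and the r_squared helper are hypothetical and introduced only for illustration.

```python
import math

N_CODONS = 64  # assumption: all 64 triplets are treated as distinct symbols


def normalize(raw):
    """Scale a list of positive weights so that they sum to 1."""
    total = sum(raw)
    return [x / total for x in raw]


def log_rank_model(n):
    """p_r = C * (ln(n + 1) - ln r) for r = 1..n, with C set by normalization."""
    return normalize([math.log(n + 1) - math.log(r) for r in range(1, n + 1)])


def zipf_model(n, z):
    """p_r = C / r**z for r = 1..n, with C set by normalization."""
    return normalize([r ** -z for r in range(1, n + 1)])


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a least-squares straight line."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)


ln_r = [math.log(r) for r in range(1, N_CODONS + 1)]
p_log = log_rank_model(N_CODONS)
p_zipf = zipf_model(N_CODONS, z=1.0)  # z = 1.0 is an arbitrary illustrative choice

# The proposed model is a straight line in (ln r, p) ...
r2_log_semilog = r_squared(ln_r, p_log)
# ... but not in (ln r, ln p), where the Zipf model is a straight line.
r2_log_loglog = r_squared(ln_r, [math.log(p) for p in p_log])
r2_zipf_loglog = r_squared(ln_r, [math.log(p) for p in p_zipf])

print("proposed model, (ln r, p):   R^2 =", r2_log_semilog)
print("proposed model, (ln r, ln p): R^2 =", r2_log_loglog)
print("Zipf model,     (ln r, ln p): R^2 =", r2_zipf_loglog)
```

With observed codon counts, the same r_squared helper could be applied to the sorted empirical frequencies in both coordinate systems to see which one is closer to a straight line; the abstract's own statistical criteria are not specified, so this is only a rough visual-style check.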



Cited by

1
Re-evaluating Phoneme Frequencies.
Front Psychol. 2020 Nov 20;11:570895. doi: 10.3389/fpsyg.2020.570895. eCollection 2020.
2
Gaussian-Distributed Codon Frequencies of Genomes.
G3 (Bethesda). 2019 May 7;9(5):1449-1456. doi: 10.1534/g3.118.200939.
3
Scale-free networks versus evolutionary drift.
Comput Biol Chem. 2004 Oct;28(4):257-64. doi: 10.1016/j.compbiolchem.2004.07.001.
4
WORDUP: an efficient algorithm for discovering statistically significant patterns in DNA sequences.
Nucleic Acids Res. 1992 Jun 11;20(11):2871-5. doi: 10.1093/nar/20.11.2871.