Department of Computer Science, University at Albany, State University of New York, Albany, NY 12222, USA.
Database (Oxford). 2012 Feb 8;2012:bas002. doi: 10.1093/database/bas002. Print 2012.
A codon consists of three nucleotides and functions during translation to dictate the insertion of a specific amino acid in a growing peptide or, in the case of stop codons, to specify the completion of protein synthesis. There are 64 possible single codons and there are 4,096 double, 262,144 triple, 16,777,216 quadruple and 1,073,741,824 quintuple codon combinations available for use by specific genes and genomes. In order to evaluate the use of specific single, double, triple, quadruple and quintuple codon combinations in genes and gene networks, we have developed a codon counting tool and employed it to analyze 5780 Saccharomyces cerevisiae genes. We have also developed visualization approaches, including codon painting, combination and bar graphs, and have used them to identify distinct codon usage patterns in specific genes and groups of genes. Using our developed Gene-Specific Codon Counting Database, we have identified extreme codon runs in specific genes. We have also demonstrated that specific codon combinations or usage patterns are over-represented in genes whose corresponding proteins belong to ribosome or translation-associated biological processes. Our resulting database provides a mineable list of multi-codon data and can be used to identify unique sequence runs and codon usage patterns in individual and functionally linked groups of genes.
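As an illustration of the counting described in the abstract, the following is a minimal Python sketch and not the authors' tool. It assumes that a multi-codon combination means a run of n adjacent in-frame codons and that the counting window advances one codon at a time; the abstract does not state either detail. The function names and the toy sequence are hypothetical.

```python
from collections import Counter


def split_codons(orf):
    """Split an in-frame coding sequence into consecutive 3-nucleotide codons."""
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def count_codon_combinations(orf, n):
    """Count every run of n adjacent in-frame codons.

    Assumption: the window slides one codon at a time, so runs overlap.
    """
    codons = split_codons(orf)
    return Counter(
        "".join(codons[i:i + n]) for i in range(len(codons) - n + 1)
    )


# Number of possible combinations for single to quintuple codons: 64**n.
for n in range(1, 6):
    print(n, 64 ** n)

# Hypothetical toy ORF: ATG GAA GAA GAA GAA TAA
toy = "ATGGAAGAAGAAGAATAA"
print(count_codon_combinations(toy, 1))
print(count_codon_combinations(toy, 2))
```

The loop reproduces the combination counts quoted in the abstract (64, 4096, 262144, 16777216 and 1073741824). In the toy sequence, the codon GAA occurs four times and the double codon GAAGAA occurs three times, which shows how overlapping windows make a codon run stand out in the counts.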