School of Computer and Information, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
Comput Biol Chem. 2013 Feb;42:35-9. doi: 10.1016/j.compbiolchem.2012.11.003. Epub 2012 Nov 24.
The 'effective number of codons' (Nc), introduced by Frank Wright in 1990, is one of the best measures of codon usage bias in genes and genomes. Although several investigators have since improved the methods for estimating Nc, no one has noticed that the relationship between Nc and GC3s given by Wright under the assumption of no selection shows a small but significant deviation. Because the curve representing this relationship in the Nc-plot serves as a reference line for displaying the main features of the codon usage pattern of a set of genes, its accuracy is important. Under the ideal and limiting conditions listed in this text, a computational sample of Nc versus GC3s was derived and calculated. By nonlinear regression analysis, the relationship between Nc and GC3s in the absence of synonymous codon selection can be approximated by $N_c = 2.5 - s + \frac{29.5}{s^2 + (1-s)^2}$, instead of Wright's $N_c = 2 + s + \frac{29}{s^2 + (1-s)^2}$, where $s$ denotes GC3s. Goodness-of-fit analysis of both formulas confirmed that the new formula presented in this text is more accurate than the original one. In addition, when the same method of estimating Nc is used, overestimation is reduced to some extent by using the new reference line rather than Wright's.
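To make the difference between the two reference lines concrete, the following minimal sketch evaluates both formulas quoted above at a few GC3s values. The function names `nc_wright` and `nc_revised` are illustrative assumptions and do not come from the paper; only the two closed-form expressions are taken from the abstract.

```python
def nc_wright(s: float) -> float:
    """Wright's (1990) expected Nc at GC3s = s: 2 + s + 29 / (s^2 + (1 - s)^2)."""
    return 2.0 + s + 29.0 / (s**2 + (1.0 - s) ** 2)


def nc_revised(s: float) -> float:
    """Revised expected Nc at GC3s = s: 2.5 - s + 29.5 / (s^2 + (1 - s)^2)."""
    return 2.5 - s + 29.5 / (s**2 + (1.0 - s) ** 2)


if __name__ == "__main__":
    # Hypothetical helper names; s is GC3s and must lie in [0, 1].
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"GC3s={s:.2f}  Wright={nc_wright(s):.3f}  revised={nc_revised(s):.3f}")
```

At $s = 0.5$ the two expressions give 60.5 and 61, respectively, so the revised curve peaks at the maximum of 61 sense codons, whereas Wright's approximation peaks slightly lower.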