Ma Xiang-Ru, Xiao Shao-Bo, Guo Ai-Zhen, Lv Jian-Qing, Chen Huan-Chun
Laboratory of Animal Virology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070, China.
Acta Biochim Biophys Sin (Shanghai). 2004 Jan;36(1):16-20. doi: 10.1093/abbs/36.1.16.
Sueoka and Lobry each proposed that, in the absence of bias between the two DNA strands in mutation and selection, the base composition within each strand should satisfy A=T and C=G (a state known as Parity Rule type 2, PR2). However, the genome sequences of many bacteria, vertebrates and viruses show asymmetries in base composition and gene direction. To examine how base composition skews relate to replication orientation, gene function, codon usage bias and phylogenetic evolution, we developed a program called DNAskew for the statistical analysis of strand asymmetry and codon composition bias in DNA sequences. The program can also be used to predict the replication boundaries of genome sequences, a method that relies on the compositional asymmetry between the leading and lagging strands of replication. DNAskew is written in Perl and implemented on the Linux operating system. It works quickly with annotated or unannotated sequences in GBFF (GenBank flat file) or FASTA format. The source code is freely available for academic use at http://www.epizooty.com/pub/stat/DNAskew.
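As an illustration of the kind of strand-asymmetry statistic described in the abstract, the sketch below computes a base skew, (x - y) / (x + y), in consecutive windows and locates the extremes of the cumulative GC skew, which in many bacterial genomes tend to fall near the replication origin (minimum) and terminus (maximum). This is not DNAskew's code: it is written in Python rather than Perl, and the function names, the non-overlapping window scheme and the use of cumulative-skew extremes are all assumptions made for this example.

```python
from itertools import accumulate


def skew(seq, x="G", y="C"):
    """Return (x - y) / (x + y) for the counts of bases x and y in seq.

    Returns 0.0 when neither base occurs, to avoid division by zero.
    """
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    total = nx + ny
    return (nx - ny) / total if total else 0.0


def windowed_skew(seq, window, x="G", y="C"):
    """Skew in consecutive, non-overlapping windows of the given size."""
    return [skew(seq[i:i + window], x, y) for i in range(0, len(seq), window)]


def predict_boundaries(seq, window):
    """Estimate GC-skew switch points from the cumulative skew curve.

    Returns (min_pos, max_pos): the end coordinates of the windows at
    which the cumulative GC skew reaches its minimum and its maximum.
    Illustrative heuristic only, not the method used by DNAskew.
    """
    cumulative = list(accumulate(windowed_skew(seq, window)))
    i_min = min(range(len(cumulative)), key=cumulative.__getitem__)
    i_max = max(range(len(cumulative)), key=cumulative.__getitem__)
    return (i_min + 1) * window, (i_max + 1) * window


# Toy sequence (hypothetical data): G-rich, then C-rich, then G-rich again.
toy = "G" * 30 + "C" * 40 + "G" * 10
print(windowed_skew(toy, 10))
print(predict_boundaries(toy, 10))
```

On a real genome, a window of several kilobases would be a more reasonable choice than the 10-base window used for the toy sequence, and a circular chromosome would need the sequence rotated or wrapped before the extremes are interpreted.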