Tan Do Yew, Hair Bejo Mohd, Aini Ideris, Omar Abdul Rahman, Goh Yong Meng
Department of Veterinary Pathology and Microbiology, Faculty of Veterinary Medicine, Universiti Putra Malaysia, 43400, UPM, Serdang, Selangor, Malaysia.
Virus Genes. 2004 Jan;28(1):41-53. doi: 10.1023/B:VIRU.0000012262.89898.c7.
Base usage and dinucleotide frequency have been extensively studied in many eukaryotic organisms and bacteria, but not in viruses. This paper presents a comprehensive analysis of both for infectious bursal disease virus (IBDV). The analysis of base usage indicated that all of the IBDV genes possess equivalent overall nucleotide distributions. However, when the base usage at each codon position was examined by cluster analysis, the VP5 open reading frame (ORF) formed a separate cluster, isolated from the other genes. The unusual base usage of the VP5 ORF may indicate that the gene originated through the viral "overprinting strategy", in which a virus creates a novel gene by utilizing the unused reading frames of its existing genes. The GC content of the IBDV genes was comparable to that of the chicken's coding sequences, suggesting that the virus imitates its host to increase its translational efficiency. The analysis of dinucleotide frequency indicated that the IBDV genome has a dinucleotide bias: the frequencies of CpG and TpA were lower than expected, and that of TpG was higher than expected. The classical methylation pathway, in which CpG is converted to TpG, may explain the significant correlation between CpG deficiency and TpG abundance. "Principal component analysis of the dinucleotide frequencies" (DF-PCA) was used to analyse the overall dinucleotide frequencies of the IBDV genome. DF-PCA on the hypervariable region and the polyprotein (VPX-VP4-VP3) gene showed that very virulent IBDV (vvIBDV) was segregated from the other strains, which indicates that vvIBDV has a unique dinucleotide pattern. In summary, the study of base usage and dinucleotide frequency has revealed many overlooked genomic properties of the virus.
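The two measurements named in the abstract, base usage at each codon position and dinucleotide frequency relative to expectation, can be illustrated with a minimal Python sketch. The abstract does not state how the expected dinucleotide frequencies were derived, so the sketch assumes the common observed/expected odds ratio, in which the expected frequency of a dinucleotide XpY is the product of the single-base frequencies of X and Y. The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def base_usage_by_codon_position(orf):
    """Return base fractions at codon positions 1, 2 and 3 of an in-frame ORF."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    usage = []
    for pos in range(3):
        counts = Counter(orf[pos:usable:3])
        total = sum(counts[b] for b in BASES)
        usage.append({b: counts[b] / total if total else 0.0 for b in BASES})
    return usage


def dinucleotide_odds_ratios(seq):
    """Return observed/expected ratios for all 16 dinucleotides.

    Assumption: expected frequency of XpY = f(X) * f(Y). A ratio below 1
    means the dinucleotide is rarer than expected, above 1 more common.
    """
    seq = seq.upper()
    n = len(seq)
    base_freq = {b: seq.count(b) / n for b in BASES}
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    ratios = {}
    for x, y in product(BASES, repeat=2):
        expected = base_freq[x] * base_freq[y]
        observed = pairs[x + y] / (n - 1)
        ratios[x + y] = observed / expected if expected else float("nan")
    return ratios


# Toy sequence for illustration only; it is not an IBDV sequence.
toy_orf = "ATGTGTGCATGACATGTGTTGA"
for position, fractions in enumerate(base_usage_by_codon_position(toy_orf), start=1):
    print("codon position", position, fractions)
ratios = dinucleotide_odds_ratios(toy_orf)
print({pair: round(ratios[pair], 2) for pair in ("CG", "TA", "TG")})
```

Applied to real gene sequences, the per-position fractions would form the input to a cluster analysis, and the 16 ratios per sequence would form the feature vector for a principal component analysis such as the DF-PCA described above.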