Guo Feng-Biao, Yu Xiu-Juan
School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
J Virol Methods. 2007 Dec;146(1-2):389-92. doi: 10.1016/j.jviromet.2007.07.010. Epub 2007 Aug 23.
Using the Z curve method, the protein-coding genes in the AmEPV genome are re-predicted. With parameters trained on the experimentally validated genes, all 30 experimentally validated genes and 67 putative genes are correctly predicted as coding. On these test sets, the sensitivity of the method is 100% in both self-test and cross-validation. Thirty-eight annotated conserved and hypothetical genes are predicted to be non-coding ORFs. The number of re-predicted protein-coding genes in AmEPV is 256, which is 38 fewer than the 294 reported in the original annotation. When the method trained on the AmEPV genome is applied to the other entomopoxvirus genome, 116 of the 123 known and putative genes are correctly predicted as coding. Six of the seven missed genes are shorter than 300 bp. The method could be extended to other poxvirus genomes, with or without adapting the training sets.
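As an illustration of the kind of features the Z curve method works with, the following minimal Python sketch computes the nine phase-specific Z curve components of an ORF, using the standard Z curve definitions (x: purine vs pyrimidine, y: amino vs keto, z: weak vs strong hydrogen bonding) applied to each codon position. The function names `z_curve_phase_features` and `sensitivity` are hypothetical, and the abstract does not state the exact feature set or classifier the authors used, so this is a sketch under those assumptions rather than their implementation.

```python
from collections import Counter


def z_curve_phase_features(orf):
    """Return 9 Z curve components (x, y, z for codon positions 1, 2, 3).

    Assumption: standard Z curve definitions on per-position base
    frequencies; the paper's exact feature set is not given in the abstract.
    """
    seq = orf.upper()
    features = []
    for phase in range(3):
        # Bases occupying this codon position across the whole ORF.
        counts = Counter(b for b in seq[phase::3] if b in "ACGT")
        total = sum(counts.values())
        if total == 0:
            features.extend([0.0, 0.0, 0.0])
            continue
        a, c, g, t = (counts[b] / total for b in "ACGT")
        features.extend([
            (a + g) - (c + t),  # x: purine vs pyrimidine
            (a + c) - (g + t),  # y: amino vs keto
            (a + t) - (g + c),  # z: weak vs strong hydrogen bonding
        ])
    return features


def sensitivity(true_positives, false_negatives):
    """Fraction of known coding genes that are predicted as coding."""
    return true_positives / (true_positives + false_negatives)


if __name__ == "__main__":
    print(z_curve_phase_features("ATGATGATG"))
    # Figures taken from the abstract: 116 of 123 genes recovered.
    print(round(sensitivity(116, 7), 3))
```

With the figures from the abstract, `sensitivity(116, 7)` is 116/123, about 0.943, for the second genome, and `sensitivity(97, 0)` is 1.0 for the 30 validated plus 67 putative AmEPV genes.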