Schlub Timothy E, Holmes Edward C
Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW 2006, Australia.
School of Life and Environmental Sciences and School of Medical Sciences, Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, Sydney, NSW 2006, Australia.
Virus Evol. 2020 Feb 13;6(1):veaa009. doi: 10.1093/ve/veaa009. eCollection 2020 Jan.
Overlapping genes are commonplace in viruses and play an important role in their function and evolution. However, aside from studies on specific groups of viruses, relatively little is known about the extent and nature of gene overlap and its determinants in viruses as a whole. Here, we present an extensive characterisation of gene overlap in viruses through an analysis of reference genomes present in the NCBI virus genome database. We find that over half of the instances of gene overlap are very small, covering <10 nt, and 84 per cent are <50 nt in length. Despite this, 53 per cent of all viruses still contained a gene overlap of 50 nt or larger. We also investigate several predictors of gene overlap: genome structure (single- and double-stranded RNA and DNA), virus family, genome length, and genome segmentation. This analysis revealed that gene overlap occurs more frequently in DNA viruses than in RNA viruses, and more frequently in single-stranded viruses than in double-stranded viruses. Genome segmentation is also associated with gene overlap, particularly in single-stranded DNA viruses. Notably, we observed a large range of overlap frequencies across families of all genome types, suggesting that gene overlap is a common evolutionary trait that provides flexible genome structures in all virus families.
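To make the overlap measurements above concrete, the following is a minimal Python sketch of how the length of a gene overlap could be computed from annotated gene coordinates, and how the share of overlaps below a size threshold (such as 10 nt or 50 nt) could be tallied. It is not the authors' pipeline. It assumes a linear genome with each gene given as a 1-based, inclusive (start, end) pair, and it ignores strand, circular genomes, and spliced genes. The function names and the coordinates in `toy_genes` are hypothetical and are not data from the study.

```python
from itertools import combinations


def overlap_length(a, b):
    """Nucleotides shared by two genes given as (start, end), 1-based inclusive."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def overlap_lengths(genes):
    """Lengths of all non-zero pairwise overlaps among the genes of one genome."""
    lengths = []
    for a, b in combinations(genes, 2):
        n = overlap_length(a, b)
        if n > 0:
            lengths.append(n)
    return lengths


def fraction_below(lengths, threshold):
    """Share of overlaps strictly shorter than `threshold` nucleotides."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n < threshold) / len(lengths)


# Hypothetical coordinates, made up for illustration only.
toy_genes = [(1, 300), (297, 900), (850, 1500), (1600, 2000)]
toy_overlaps = overlap_lengths(toy_genes)

print(toy_overlaps)
print(fraction_below(toy_overlaps, 10))
print(fraction_below(toy_overlaps, 50))
```

In this toy genome the first two genes share 4 nt and the second and third share 51 nt, so one overlap falls in the "very small" class and one exceeds the 50 nt threshold used in the abstract.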