Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 23/A I-43124, Parma, Italy.
Curr Opin Virol. 2022 Feb;52:1-8. doi: 10.1016/j.coviro.2021.10.009. Epub 2021 Nov 16.
Viruses may evolve to increase the amount of encoded genetic information by means of overlapping genes, which use more than one reading frame of the same nucleotide sequence. Such overlapping genes may be especially impactful for small genomes, often serving as a source of novel accessory proteins, some of which play a crucial role in viral pathogenicity or in promoting the systemic spread of the virus. Diverse genome-based metrics have been proposed to facilitate recognition of overlapping genes that might otherwise be overlooked during genome annotation. They can detect the atypical codon bias associated with the overlap (e.g. a statistically significant reduction in variability at synonymous sites) or other sequence-composition features peculiar to overlapping genes. In this review, I compare nine computational methods, discuss their strengths and limitations, and survey how they have been applied to detect candidate overlapping genes in the genome of SARS-CoV-2, the etiological agent of the COVID-19 pandemic.
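To make the idea of a gene that uses an alternative reading frame concrete, the following is a minimal sketch in Python. It is not one of the nine methods compared in the review; it only scans the +1 and +2 frames of a coding sequence (taken to be in frame 0) for ATG-to-stop open reading frames. The function name `find_overlapping_orfs`, the `min_codons` threshold, and the toy sequence are hypothetical, and a real candidate would still need statistical support such as reduced synonymous-site variability.

```python
# Illustrative sketch only: a naive scan for open reading frames (ORFs) in the
# two alternative forward frames of a known coding sequence. All names and
# thresholds here are assumptions, not taken from the reviewed methods.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_overlapping_orfs(seq, min_codons=3):
    """Return (frame, start, end) tuples for ATG..stop ORFs in frames +1 and +2.

    `seq` is assumed to be a DNA coding sequence whose known gene is in frame 0.
    `start` is the 0-based index of the ATG; `end` is the index just past the
    stop codon. `min_codons` counts sense codons, excluding the stop.
    """
    seq = seq.upper()
    hits = []
    for frame in (1, 2):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    hits.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return hits


# Toy example: an ATG..TAA ORF of four sense codons sits in frame +1.
print(find_overlapping_orfs("GATGAAACCCGGGTAACC"))  # [(1, 1, 16)]
```

A length-only scan like this yields many false positives in real genomes, which is why the metrics surveyed in the review rely on codon bias and sequence-composition signals rather than on ORF length alone.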