Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area Delle Scienze 23/A, I-43124, Parma, Italy.
Virology. 2021 Oct;562:149-157. doi: 10.1016/j.virol.2021.07.011. Epub 2021 Jul 28.
Six candidate overlapping genes have been detected in SARS-CoV-2, yet current methods struggle to detect overlapping genes that originated recently. Such genes might nevertheless encode proteins beneficial to the virus, and they provide a model system for understanding gene birth. To complement existing detection methods, I first demonstrated that selection pressure to avoid stop codons in alternative reading frames is a driving force in the origin and retention of overlapping genes. I then built a detection method, CodScr, based on this selection pressure. Finally, I combined CodScr with methods that detect other properties of overlapping genes, such as biased nucleotide and amino acid composition. I detected two novel ORFs (ORF-Sh and ORF-Mh), overlapping the spike and membrane genes respectively, which are under selection pressure and may be beneficial to SARS-CoV-2. ORF-Sh and ORF-Mh are present as ORFs uninterrupted by stop codons in 100% and 95% of SARS-CoV-2 genomes, respectively.
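The two quantities the abstract relies on, stop-free stretches in an alternative reading frame and the share of genomes in which a candidate ORF stays uninterrupted, can be illustrated with a minimal Python sketch. This is not CodScr, whose scoring is not described here. The function names are hypothetical, the standard genetic code's stop codons are assumed, and the genome sequences are assumed to be ungapped and to share the same coordinates.

```python
# Illustrative sketch only; not the CodScr method.
# Assumption: standard genetic code, so the stop codons are TAA, TAG and TGA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_runs(seq, frame):
    """Return (start, end) nucleotide spans of stop-free codon runs.

    `frame` is 0, 1 or 2: the offset of the reading frame relative to
    the first nucleotide of `seq`. Spans are half-open, [start, end).
    """
    seq = seq.upper()
    runs = []
    run_start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            if pos > run_start:
                runs.append((run_start, pos))
            run_start = pos + 3  # next run begins after the stop codon
        pos += 3
    if pos > run_start:
        runs.append((run_start, pos))
    return runs


def longest_stop_free_run(seq, frame):
    """Return the longest stop-free span in `frame`, or None if there is none."""
    return max(stop_free_runs(seq, frame), key=lambda r: r[1] - r[0], default=None)


def fraction_intact(sequences, start, end):
    """Fraction of sequences with no stop codon among the codons in [start, end).

    Assumption: all sequences are ungapped and share the same coordinates.
    """
    if not sequences:
        return 0.0
    intact = 0
    for seq in sequences:
        seq = seq.upper()
        codons = (seq[i:i + 3] for i in range(start, end - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            intact += 1
    return intact / len(sequences)
```

For a gene read in frame 0, calling `longest_stop_free_run(gene, 1)` and `longest_stop_free_run(gene, 2)` gives the longest candidate overlapping ORFs in the two alternative frames of the same strand. `fraction_intact` then gives the kind of proportion reported above for ORF-Sh and ORF-Mh, under the stated alignment assumption.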