Characterization of accessory genes in coronavirus genomes.

Author information

Laboratoire ICube, Department of Computer Science, CNRS, University of Strasbourg, F-67412, Strasbourg, France.

Unité de Microbiologie Structurale, Institut Pasteur, CNRS UMR 3528, 75724, Paris Cedex 15, France.

Publication information

Virol J. 2020 Aug 27;17(1):131. doi: 10.1186/s12985-020-01402-1.

Abstract

BACKGROUND

COVID-19 is caused by the SARS-CoV-2 virus, a novel member of the coronavirus (CoV) family. CoV genomes code for an ORF1a/ORF1ab polyprotein and four structural proteins that have been widely studied as major drug targets. The genomes also contain a variable number of open reading frames (ORFs) coding for accessory proteins that are not essential for virus replication but appear to have a role in pathogenesis. The accessory proteins are less well characterized and are difficult to predict with classical bioinformatics methods.

METHODS

We propose a computational tool, GOFIX, to characterize potential ORFs in virus genomes. In particular, ORF coding potential is estimated by searching for enrichment in motifs of the X circular code, which is known to be over-represented in the reading frames of viral genes.
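As an illustration of the idea only, the following minimal Python sketch scores each of the three forward frames of a sequence by the fraction of codons that lie in runs of consecutive X-code trinucleotides. It is not the GOFIX implementation: the function names `x_motif_coverage` and `frame_scores`, the `min_codons` threshold of 4, and the coverage-based score are all assumptions made for this example. The set of 20 trinucleotides is the X circular code described by Arquès and Michel (1996).

```python
# The 20 trinucleotides of the X circular code (Arques and Michel, 1996).
X_CODE = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def x_motif_coverage(seq, frame, min_codons=4):
    """Fraction of codons in `frame` (0, 1 or 2) that lie in runs of at
    least `min_codons` consecutive X trinucleotides.

    `min_codons` is a hypothetical threshold chosen for illustration;
    it is not taken from the GOFIX paper.
    """
    seq = seq.upper().replace("U", "T")
    # Split the sequence into complete codons starting at the given frame.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not codons:
        return 0.0

    covered = 0  # codons that belong to a sufficiently long X run
    run = 0      # length of the current run of X codons
    for codon in codons:
        if codon in X_CODE:
            run += 1
        else:
            if run >= min_codons:
                covered += run
            run = 0
    if run >= min_codons:  # close a run that reaches the end of the sequence
        covered += run
    return covered / len(codons)


def frame_scores(seq, min_codons=4):
    """X motif coverage for each of the three forward frames."""
    return {frame: x_motif_coverage(seq, frame, min_codons) for frame in range(3)}


if __name__ == "__main__":
    # Toy sequence built from X codons in frame 0 (not a real viral sequence).
    print(frame_scores("GAAGACGAGGATGCC"))
```

Under these assumptions, a candidate ORF whose own frame scores clearly higher than the two shifted frames would count as enriched in X motifs. A real analysis would also need to handle the reverse strand, ambiguous bases, and a statistical baseline for enrichment.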

RESULTS

We applied GOFIX to study SARS-CoV-2 and related genomes, including SARS-CoV and SARS-like viruses from bat, civet and pangolin hosts, focusing on the accessory proteins. Our analysis provides evidence supporting the presence of overlapping ORFs 7b, 9b and 9c in all the genomes and thus helps to resolve some differences in current genome annotations. In contrast, we predict that ORF3b is not functional in all genomes. Novel putative ORFs were also predicted, including a truncated form of the ORF10 previously identified in SARS-CoV-2 and a little-known ORF overlapping the Spike protein in Civet-CoV and SARS-CoV.

CONCLUSIONS

Our findings contribute to characterizing sequence properties of accessory genes of SARS coronaviruses, and especially the newly acquired genes making use of overlapping reading frames.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73c0/7457249/11e043d0ad96/12985_2020_1402_Fig1_HTML.jpg
