Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Ciudad Universitaria UNL, Santa Fe 3000, Argentina.
Bioengineering and Bioinformatics Research and Development Institute (IBB), FI-UNER, CONICET, Entre Ríos 3100, Argentina.
Bioinformatics. 2021 Apr 5;36(24):5571-5581. doi: 10.1093/bioinformatics/btaa1002.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has recently emerged as the agent responsible for the coronavirus disease 2019 pandemic. This virus is closely related to coronaviruses infecting bats and Malayan pangolins, species suspected of acting as intermediate hosts in the passage to humans. Several genomic mutations affecting viral proteins have been identified, contributing to the understanding of the recent animal-to-human transmission. However, the capacity of SARS-CoV-2 to encode functional putative microRNAs (miRNAs) remains largely unexplored.
We used deep learning to discover 12 candidate stem-loop structures hidden in the protein-coding regions of the viral genome. Among these precursors, the expression of eight mature miRNA-like sequences was confirmed in small RNA-seq data from SARS-CoV-2-infected human cells. The predicted miRNAs are likely to target a subset of human genes, 109 of which are transcriptionally deregulated upon infection. Remarkably, 28 of the genes potentially targeted by SARS-CoV-2 miRNAs are down-regulated in infected human cells. Most of them have been related to respiratory diseases and viral infection, including several conditions previously associated with SARS-CoV-1 and SARS-CoV-2. Comparison of SARS-CoV-2 pre-miRNA sequences with those from bat and pangolin coronaviruses suggests that single-nucleotide mutations could have helped its progenitors jump inter-species boundaries, allowing the gain of novel mature miRNAs targeting human mRNAs. Our results suggest that the recent acquisition of novel miRNA-like sequences in the SARS-CoV-2 genome may have contributed to modulating the transcriptional reprogramming of the new host upon infection.
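The cross-species comparison of pre-miRNA sequences mentioned above comes down to locating single-nucleotide differences between aligned sequences. The following is a minimal sketch of that idea only, not the authors' pipeline. The function name `point_mutations` and both toy sequences are invented for illustration, and the sequences are assumed to be already aligned to equal length with no gaps.

```python
from __future__ import annotations


def point_mutations(ref: str, other: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, ref base, other base) for every mismatch.

    Assumes the two sequences are pre-aligned and of equal length.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(ref.upper(), other.upper()), start=1)
        if a != b
    ]


# Toy RNA fragments, made up for this example (not real viral sequences).
human_virus = "ACGUUGCAAGGCU"
bat_virus = "ACGUUGCGAGGCU"

for pos, a, b in point_mutations(human_virus, bat_virus):
    print(f"position {pos}: {a} -> {b}")
```

In a real analysis the sequences would first be aligned with a dedicated tool, since insertions and deletions shift positions and break a simple position-by-position comparison.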
Source code is available at https://github.com/sinc-lab/sarscov2-mirna-discovery.
Supplementary data are available at Bioinformatics online.