Sheffield Bioinformatics Core, The University of Sheffield, Sheffield S10 2HQ, United Kingdom.
Sheffield Institute for Translational Neuroscience, The University of Sheffield, Sheffield S10 2HQ, United Kingdom.
Genome Res. 2021 Apr;31(4):645-658. doi: 10.1101/gr.268110.120. Epub 2021 Mar 15.
We have developed periscope, a tool for the detection and quantification of subgenomic RNA (sgRNA) in SARS-CoV-2 genomic sequence data. The translation of the SARS-CoV-2 RNA genome for most open reading frames (ORFs) occurs via RNA intermediates termed "subgenomic RNAs." sgRNAs are produced through discontinuous transcription, which relies on homology between the transcription regulatory sequences upstream of the ORF start codons (TRS-B) and the TRS-L, which is located in the 5' UTR. TRS-L is immediately preceded by a leader sequence. This leader sequence is therefore found at the 5' end of all sgRNAs. We applied periscope to 1155 SARS-CoV-2 genomes from Sheffield, United Kingdom, and validated our findings using orthogonal data sets and in vitro cell systems. By using a simple local alignment to detect reads that contain the leader sequence, we were able to identify and quantify reads arising from canonical and noncanonical sgRNAs. We were able to detect all canonical sgRNAs at the expected abundances, with the exception of ORF10. A number of recurrent noncanonical sgRNAs were also detected. We show that the results are reproducible using technical replicates and determine the optimum number of reads for sgRNA analysis. In VeroE6 +/- cell lines, periscope can detect the changes in the kinetics of sgRNA in orthogonal sequencing data sets. Finally, variants found in genomic RNA are transmitted to sgRNAs with high fidelity in most cases. This tool can be applied to all sequenced COVID-19 samples worldwide to provide comprehensive analysis of SARS-CoV-2 sgRNA.
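The leader-detection idea described in the abstract can be illustrated with a minimal sketch. This is not periscope's implementation: the function names, scoring values, the 80% score threshold, and the `EXAMPLE_LEADER` string are all assumptions chosen for illustration (the placeholder is not the real SARS-CoV-2 leader sequence). The sketch scores the 5' end of each read against a leader sequence with a Smith-Waterman local alignment and labels the read as sgRNA-derived when the score is close to the maximum possible.

```python
# Hypothetical placeholder, NOT the real SARS-CoV-2 leader sequence.
EXAMPLE_LEADER = "ACGTTGCAAGCTTAGGCTAC"


def local_alignment_score(query, target, match=2, mismatch=-2, gap=-3):
    """Best Smith-Waterman local alignment score (linear gap penalty)."""
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def classify_read(read, leader=EXAMPLE_LEADER, min_fraction=0.8, match=2):
    """Return 'sgRNA' if the read's 5' end aligns well to the leader.

    min_fraction is an assumed cut-off: the fraction of the perfect-match
    score that the local alignment must reach.
    """
    window = read[: 2 * len(leader)]  # only inspect the 5' end of the read
    score = local_alignment_score(leader, window, match=match)
    return "sgRNA" if score >= min_fraction * match * len(leader) else "gRNA"


def count_classes(reads, leader=EXAMPLE_LEADER):
    """Tally sgRNA versus genomic reads for a simple quantification."""
    counts = {"sgRNA": 0, "gRNA": 0}
    for read in reads:
        counts[classify_read(read, leader)] += 1
    return counts
```

A local rather than global alignment is used here because the leader is expected to cover only the start of an sgRNA read, with the ORF sequence following it; tolerating a mismatch or two also allows for sequencing errors. In practice the threshold and scoring scheme would need tuning against real data.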