BSGatlas: a unified genome and transcriptome annotation atlas with enhanced information access.

Affiliations

Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark.

Division of Oncogenomics, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands.

Publication information

Microb Genom. 2021 Feb;7(2). doi: 10.1099/mgen.0.000524.

Abstract

A large part of our current understanding of gene regulation in Gram-positive bacteria is based on Bacillus subtilis, as it is one of the most well-studied bacterial model systems. The rapidly growing body of data on its molecular and genomic biology is distributed across multiple annotation resources. Consequently, the interpretation of data from further experiments becomes increasingly challenging in both small- and large-scale analyses. Additionally, annotation of structured RNA and non-coding RNA (ncRNA), as well as of the operon structure, still lags behind the annotation of the coding sequences. To address these challenges, we created the B. subtilis genome atlas, BSGatlas, which integrates and unifies multiple existing annotation resources. Compared to any of the individual resources, the BSGatlas contains twice as many ncRNAs, while improving the positional annotation for 70 % of the ncRNAs. Furthermore, we combined known transcription start and termination sites with lists of known co-transcribed gene sets to create a comprehensive transcript map. The combination with transcription start/termination site annotations resulted in 717 new sets of co-transcribed genes and 5335 untranslated regions (UTRs). In comparison to existing resources, the number of 5' and 3' UTRs increased nearly fivefold, and the number of internal UTRs doubled. The transcript map is organized into 2266 operons and provides transcriptional annotation for 92 % of all genes in the genome, compared to at most 82 % in previous resources. We predicted an off-target-aware genome-wide library of CRISPR-Cas9 guide RNAs, which we also linked to polycistronic operons. We provide the BSGatlas in multiple forms: as a website (https://rth.dk/resources/bsgatlas/), as an annotation hub for display in the UCSC genome browser, as supplementary tables, and in standardized GFF3 format, which can be used in large-scale -omics studies. By complementing existing resources, the BSGatlas supports analyses of the B. subtilis genome and its molecular biology with respect to not only non-coding genes but also genome-wide transcriptional relationships of all genes.
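
The GFF3 format mentioned in the abstract is a plain-text, tab-separated format with nine columns per feature line, the third of which holds the feature type. As a minimal sketch of how such a file might be summarized in a large-scale analysis, the Python snippet below tallies features by type. The function name count_feature_types and the sample records are hypothetical and chosen only for illustration; the feature types and identifiers actually used in the BSGatlas GFF3 file may differ and should be checked against the downloaded file.

```python
import io
from collections import Counter


def count_feature_types(handle):
    """Count GFF3 features by the value of column 3 (the feature type)."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # anything after this directive is sequence, not annotation
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and other directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[fields[2]] += 1
    return counts


# Hypothetical records for illustration only; NOT taken from the BSGatlas files.
SAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr\texample\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "chr\texample\tfive_prime_UTR\t60\t99\t.\t+\t.\tID=utr1;Parent=gene1\n"
    "chr\texample\tgene\t950\t1500\t.\t+\t.\tID=gene2\n"
)

counts = count_feature_types(io.StringIO(SAMPLE_GFF3))
print(dict(counts))
```

To run the same tally on a real file, the in-memory sample can be replaced by an opened file handle, for example `with open(path) as handle: counts = count_feature_types(handle)`, where `path` points to a locally saved GFF3 file.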

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8eec/8208703/4c1872824a89/mgen-7-524-g001.jpg
