SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences.

Affiliations

Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Zañartu 1482, Ñuñoa Santiago 7780132, Chile.

Laboratorio Medicina Traslacional, Fundación Arturo López Pérez, José Manuel Infante 805, Providencia, Santiago 7500691, Chile.

Publication information

Database (Oxford). 2021 Jan 28;2021. doi: 10.1093/database/baab002.

Abstract

Single-exon coding sequences (CDSs), also known as 'single-exon genes' (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancers and neurological/developmental disorders, and many exhibit tissue-specific transcription. We developed SinEx DB, which houses DNA and protein sequence information for SEGs from 10 mammalian genomes, including human. SinEx DB includes their functional predictions (KOG, euKaryotic Orthologous Groups) and the relative distribution of these functions within species. Here, we report SinEx DB 2.0, a major update of SinEx DB that includes information on the occurrence, distribution and functional prediction of SEGs from 60 completely sequenced eukaryotic genomes, representing animals, fungi, protists and plants. The information is stored in a relational database built with MySQL Server 5.7, and the complete dataset of SEG sequences and their GO (Gene Ontology) functional assignments is available for download. SinEx DB 2.0 was built with a novel pipeline that helps disambiguate single-exon isoforms from SEGs. SinEx DB 2.0 is the largest available database for SEGs and provides a rich source of information for advancing our understanding of the evolution and function of SEGs and their associations with disorders, including cancers and neurological and developmental diseases. Database URL: http://v2.sinex.cl/.
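
The distinction the abstract draws between SEGs and single-exon isoforms can be illustrated with a small sketch. This is not the SinEx DB 2.0 pipeline, whose details are not given here. It assumes a hypothetical input of CDS segments already grouped by gene and transcript, and it assumes one plausible rule: a gene is labelled a SEG only when every annotated transcript has an uninterrupted CDS, while a gene that mixes uninterrupted and split CDSs is treated as having a single-exon isoform. The function name `classify_genes` and the record layout are illustrative choices.

```python
from collections import defaultdict


def classify_genes(cds_records):
    """Label genes by how many CDS segments each transcript has.

    cds_records: iterable of (gene_id, transcript_id, start, end) tuples,
    one per CDS segment (hypothetical layout, not the SinEx DB schema).
    Only CDS segments are counted, so introns in UTRs are ignored,
    in line with the definition of a SEG as lacking introns in its CDS.
    """
    segments = defaultdict(lambda: defaultdict(set))
    for gene_id, transcript_id, start, end in cds_records:
        segments[gene_id][transcript_id].add((start, end))

    labels = {}
    for gene_id, transcripts in segments.items():
        counts = [len(segs) for segs in transcripts.values()]
        if all(n == 1 for n in counts):
            # Assumed rule: every transcript has an uninterrupted CDS.
            labels[gene_id] = "SEG"
        elif any(n == 1 for n in counts):
            # Assumed rule: intronless CDS in some transcripts only.
            labels[gene_id] = "single-exon isoform"
        else:
            labels[gene_id] = "multi-exon"
    return labels


# Toy records, invented for illustration only.
records = [
    ("g1", "t1", 100, 400),
    ("g2", "t1", 100, 200),
    ("g2", "t1", 300, 400),
    ("g2", "t2", 100, 400),
    ("g3", "t1", 1, 50),
    ("g3", "t1", 80, 120),
]
labels = classify_genes(records)
print(labels)
```

On the toy records, `g1` is labelled a SEG, `g2` a single-exon isoform and `g3` multi-exon. A real pipeline would also need to filter for nuclear, protein-coding genes and handle annotation errors, which this sketch leaves out.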

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ef2/7904048/00ce909eb547/baab002f1.jpg
