Verma Paras, Thakur Deeksha, Pandit Shashi B
Department of Biological Sciences, Indian Institute of Science Education and Research (IISER)-Mohali, Punjab, 140306, India.
Bioinform Adv. 2024 Oct 29;4(1):vbae157. doi: 10.1093/bioadv/vbae157. eCollection 2024.
Gene transcripts are distinguished by their exon composition, and differences in exon composition may contribute to increasing proteome complexity. Although alternative splicing information is documented in various databases, readily associating exonic variations with the protein sequence remains a major challenge.
To associate exonic variation(s) with the protein systematically, we designed the Exon Nomenclature and Classification of Transcripts (ENACT) framework, which annotates each exon uniquely by tracking its locus in the context of the gene architecture while encapsulating variations in splice site(s) and amino acid coding status. After ENACT annotation, predicted protein features (secondary structure, disorder, and Pfam domains) are mapped to exon attributes. ENACTdb thus provides a trackable association of exonic variation(s) with isoform(s) and protein features, enabling the assessment of functional variation due to changes in exon composition. Such analyses can be readily performed through the multiple views supported by the server. The exon-centric visualizations of ENACT-annotated isoforms could provide insights into the functional repertoire of genes arising from alternative splicing and its related processes, and can serve as an important resource for the research community.
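The idea of mapping protein features to exons can be illustrated with a minimal sketch. It is not the ENACT implementation: the `Exon` and `Feature` classes, the labels `E1` to `E4`, and all coordinates are hypothetical, and the sketch assumes a plus-strand transcript whose coding exons consist entirely of coding sequence. Each coding exon is converted to the range of residues it encodes, and a feature is assigned to every exon whose range overlaps it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    label: str    # hypothetical identifier, not ENACT's actual nomenclature
    start: int    # genomic start, 1-based inclusive
    end: int      # genomic end, inclusive
    coding: bool  # simplifying assumption: exon is fully coding or fully non-coding


@dataclass(frozen=True)
class Feature:
    name: str
    aa_start: int  # first residue, 1-based inclusive
    aa_end: int    # last residue, inclusive


def exon_aa_spans(exons):
    """Return {label: (first_aa, last_aa)} for coding exons in transcript order."""
    spans = {}
    coding_nt = 0  # coding nucleotides consumed by upstream exons
    for exon in exons:
        if not exon.coding:
            continue
        first_aa = coding_nt // 3 + 1
        coding_nt += exon.end - exon.start + 1
        last_aa = (coding_nt + 2) // 3  # residue containing the exon's last nucleotide
        spans[exon.label] = (first_aa, last_aa)
    return spans


def map_features(exons, features):
    """Return {feature name: [labels of exons whose residue range overlaps it]}."""
    spans = exon_aa_spans(exons)
    return {
        f.name: [
            label
            for label, (first, last) in spans.items()
            if first <= f.aa_end and f.aa_start <= last
        ]
        for f in features
    }


# Illustrative data only; none of these values come from ENACTdb.
exons = [
    Exon("E1", 100, 150, coding=False),
    Exon("E2", 200, 289, coding=True),  # 90 nt
    Exon("E3", 400, 460, coding=True),  # 61 nt
    Exon("E4", 600, 628, coding=True),  # 29 nt
]
features = [Feature("helix", 25, 35), Feature("domain", 51, 58)]

spans = exon_aa_spans(exons)
mapping = map_features(exons, features)
print(spans)
print(mapping)
```

In this sketch a codon split across an exon boundary is attributed to both exons, so a feature touching that residue is reported for both. Dropping or adding an exon changes the residue ranges of all downstream exons, which is why tracking features per exon rather than per isoform makes the effect of a change in exon composition easier to follow.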
The database is publicly available at https://www.iscbglab.in/enactdb/. It contains protein-coding genes and isoforms for five species.