Top-ranked expressed gene transcripts of human protein-coding genes investigated with GTEx dataset.

Affiliations

Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan, ROC.

Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan, ROC.

Publication information

Sci Rep. 2020 Oct 1;10(1):16245. doi: 10.1038/s41598-020-73081-5.

Abstract

With the considerable accumulation of RNA-Seq transcriptome data, our understanding of protein-coding gene transcript composition has expanded. However, the complex alternative patterns of human protein-coding gene transcripts complicate the processing and interpretation of gene expression data. It is essential to exhaustively interrogate the complex mRNA isoforms of protein-coding genes with a unified data resource. To identify representative mRNA transcript isoforms that can serve as references for transcriptome analysis, we used GTEx data to establish a top-ranked transcript isoform expression data resource for human protein-coding genes. Distinctive tissue-specific expression profiles and modulations could be observed for individual top-ranked transcripts of protein-coding genes. Protein-coding transcripts or genes account for a much larger expression fraction in transcriptome data. In addition, top-ranked transcripts are the dominantly expressed ones in various normal tissues. Intriguingly, some of the top-ranked transcripts are noncoding splicing isoforms, which implies diverse gene regulation mechanisms. Comprehensive investigation of the tissue expression patterns of top-ranked transcript isoforms is crucial. Thus, we established a web tool to examine top-ranked transcript isoforms in various normal human tissue types, which provides concise transcript information and easy-to-use graphical user interfaces. Investigation of top-ranked transcript isoforms would contribute to understanding the functional significance of distinctive alternatively spliced transcript isoforms.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0093/7530651/b34e28b81cbb/41598_2020_73081_Fig1_HTML.jpg
