Polyadenylation-related isoform switching in human evolution revealed by full-length transcript structure.

Affiliations

Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China.

College of Future Technology, Peking University, Beijing, China.

Publication information

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab157.

Abstract

The rhesus macaque is a unique nonhuman primate model for human evolutionary and translational studies, but its error-prone gene models critically limit its applications. Here, we defined full-length macaque gene models de novo based on single-molecule, long-read transcriptome sequencing in four macaque tissues (frontal cortex, cerebellum, heart and testis). Overall, 8 588 227 poly(A)-bearing complementary DNA reads with a mean length of 14 106 nt were generated to compile the backbone of macaque transcripts, with the fine-scale structures further refined by RNA sequencing and cap analysis of gene expression (CAGE) sequencing data. In total, 51 605 macaque gene models were accurately defined, covering 89.7% of macaque or 75.7% of human orthologous genes. Based on the full-length gene models, we performed a human-macaque comparative analysis of polyadenylation (PA) regulation. Using macaque and mouse as outgroup species, we identified 79 distal PA events that newly originated in humans and found that the strengthening of the distal PA sites, rather than the weakening of the proximal sites, predominantly contributes to the origination of these human-specific isoforms. Notably, these isoforms are generally under selective constraint and contribute to the temporospatially specific reduction of gene expression through the tinkering of preexisting mechanisms of nuclear retention and microRNA (miRNA) regulation. Overall, the protocol and resource highlight the application of bioinformatics in integrating multilayer genomics data to provide an intact reference for model animal studies, and the isoform switching detected may constitute a hitherto underestimated regulatory layer in shaping the human-specific transcriptome and phenotypic changes.
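
The outgroup reasoning in the abstract can be illustrated with a minimal sketch. It assumes a simple parsimony rule: a distal PA site used in human but absent in both macaque and mouse is treated as newly originated in the human lineage. The record type `PaUsage`, the function names, and the toy gene labels are hypothetical and are not taken from the paper; the authors' actual pipeline and thresholds are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaUsage:
    """Hypothetical record: is a distal PA site used in each species
    for one orthologous gene? (Assumed structure, not the authors' format.)"""
    gene: str
    human: bool
    macaque: bool
    mouse: bool


def is_human_specific(site: PaUsage) -> bool:
    # Assumed parsimony rule: present in human, absent in both outgroups.
    return site.human and not site.macaque and not site.mouse


def human_specific_events(sites):
    # Return gene labels whose distal PA site passes the rule above.
    return [s.gene for s in sites if is_human_specific(s)]


# Toy data for illustration only; not real genes or results.
sites = [
    PaUsage("GENE_A", human=True, macaque=False, mouse=False),
    PaUsage("GENE_B", human=True, macaque=True, mouse=False),
    PaUsage("GENE_C", human=True, macaque=True, mouse=True),
    PaUsage("GENE_D", human=False, macaque=False, mouse=False),
]

print(human_specific_events(sites))
```

Requiring absence in both outgroups, rather than in macaque alone, is what lets a gain on the human lineage be distinguished from a loss on the macaque lineage.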

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0fe/8574621/11478b53e577/bbab157f1.jpg
