Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions.

Authors

Denoeud France, Kapranov Philipp, Ucla Catherine, Frankish Adam, Castelo Robert, Drenkow Jorg, Lagarde Julien, Alioto Tyler, Manzano Caroline, Chrast Jacqueline, Dike Sujit, Wyss Carine, Henrichsen Charlotte N, Holroyd Nancy, Dickson Mark C, Taylor Ruth, Hance Zahra, Foissac Sylvain, Myers Richard M, Rogers Jane, Hubbard Tim, Harrow Jennifer, Guigó Roderic, Gingeras Thomas R, Antonarakis Stylianos E, Reymond Alexandre

Affiliation

Grup de Recerca en Informática Biomèdica, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain.

Publication

Genome Res. 2007 Jun;17(6):746-59. doi: 10.1101/gr.5660607.

Abstract

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.
