Identification of higher-order functional domains in the human ENCODE regions.

Author information

Thurman Robert E, Day Nathan, Noble William S, Stamatoyannopoulos John A

Affiliation

Division of Medical Genetics, University of Washington, Seattle, Washington 98195, USA.

Publication information

Genome Res. 2007 Jun;17(6):917-27. doi: 10.1101/gr.6081407.

Abstract

It has long been posited that human and other large genomes are organized into higher-order (i.e., greater than gene-sized) functional domains. We hypothesized that diverse experimental data types generated by The ENCODE Project Consortium could be combined to delineate active and quiescent or repressed functional domains and thereby illuminate the higher-order functional architecture of the genome. To address this, we coupled wavelet analysis with hidden Markov models for unbiased discovery of "domain-level" behavior in high-resolution functional genomic data, including activating and repressive histone modifications, RNA output, and DNA replication timing. We find that higher-order patterns in these data types are largely concordant and may be analyzed collectively in the context of HeLa cells to delineate 53 active and 62 repressed functional domains within the ENCODE regions. Active domains comprise approximately 44% of the ENCODE regions but contain approximately 75%-80% of annotated genes, transcripts, and CpG islands. Repressed domains are enriched in certain classes of repetitive elements and, surprisingly, in evolutionarily conserved nonexonic sequences. The functional domain structure of the ENCODE regions appears to be largely stable across different cell types. Taken together, our results suggest that higher-order functional domains represent a fundamental organizing principle of human genome architecture.
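The abstract describes coupling wavelet analysis with hidden Markov models to find domain-level behavior. The sketch below illustrates that general idea on a toy signal; it is not the authors' implementation. It rests on several assumptions: block averaging (the Haar approximation at one scale) stands in for the wavelet step, the HMM has two Gaussian states with fixed, hand-picked parameters rather than learned ones, and the input values are invented for illustration. The function names `haar_smooth`, `viterbi_two_state`, and `domains` are hypothetical.

```python
import math


def haar_smooth(signal, level):
    """Average non-overlapping blocks of 2**level samples.

    This is the Haar approximation of the signal at one coarse scale,
    used here as a simple stand-in for a wavelet smoothing step.
    """
    width = 2 ** level
    out = []
    for start in range(0, len(signal), width):
        block = signal[start:start + width]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(obs, means, sd, stay_prob):
    """Most likely state path for a 2-state Gaussian HMM.

    Parameters are fixed by the caller (an assumption of this sketch);
    a real analysis would estimate them from the data.
    """
    log_stay = math.log(stay_prob)
    log_switch = math.log(1 - stay_prob)
    score = [math.log(0.5) + log_gauss(obs[0], means[s], sd) for s in range(2)]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in range(2):
            cand = [score[p] + (log_stay if p == s else log_switch) for p in range(2)]
            best = 0 if cand[0] >= cand[1] else 1
            pointers.append(best)
            new_score.append(cand[best] + log_gauss(x, means[s], sd))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def domains(path):
    """Collapse a state path into (start, end_exclusive, state) runs."""
    runs = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            runs.append((start, i, path[start]))
            start = i
    return runs


# Invented toy signal: a low-valued stretch followed by a high-valued one.
signal = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.0, -0.3,
          1.1, 0.8, 1.2, 0.9, 1.0, 1.3, 0.7, 1.0]
smoothed = haar_smooth(signal, level=2)
path = viterbi_two_state(smoothed, means=(0.0, 1.0), sd=0.5, stay_prob=0.9)
print(domains(path))  # [(0, 8, 0), (8, 16, 1)]
```

Here state 0 plays the role of a "repressed" domain and state 1 an "active" one. Smoothing before segmentation suppresses sample-level noise so that the HMM responds to sustained shifts in the signal rather than to isolated fluctuations.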
