University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA.
The Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Nature. 2020 Jul;583(7818):699-710. doi: 10.1038/s41586-020-2493-4. Epub 2020 Jul 29.
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.