Department of Genetics, Yale University, New Haven, Connecticut 06520.
Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago, Illinois 60637.
Genetics. 2018 Mar;208(3):937-949. doi: 10.1534/genetics.117.300657. Epub 2017 Dec 28.
To develop a catalog of regulatory sites in two major model organisms, Drosophila melanogaster and Caenorhabditis elegans, the modERN (model organism Encyclopedia of Regulatory Networks) consortium has systematically assayed the binding sites of transcription factors (TFs). Combined with data produced by our predecessor, modENCODE (Model Organism ENCyclopedia Of DNA Elements), we now have data for 262 TFs identifying 1.23 M sites in the fly genome and 217 TFs identifying 0.67 M sites in the worm genome. Because sites from different TFs are often overlapping and tightly clustered, they fall into 91,011 and 59,150 regions in the fly and worm, respectively, and these binding sites span as little as 8.7 and 5.8 Mb in the two organisms. Clusters with large numbers of sites (so-called high occupancy target, or HOT, regions) predominantly associate with broadly expressed genes, whereas clusters containing sites from just a few factors are associated with genes expressed in tissue-specific patterns. All of the strains expressing GFP-tagged TFs are available at the stock centers, and the chromatin immunoprecipitation sequencing data are available through the ENCODE Data Coordinating Center and also through a simple interface (http://epic.gs.washington.edu/modERN/) that provides rapid access to processed data sets. These data will facilitate a vast number of scientific inquiries into the function of individual TFs in key developmental, metabolic, and defense and homeostatic regulatory pathways, as well as provide a broader perspective on how individual TFs work together in local networks and globally across the life spans of these two key model organisms.
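The idea of collapsing overlapping sites from different TFs into regions, and then separating heavily occupied regions from sparsely occupied ones, can be illustrated with a minimal sketch. This is not the consortium's pipeline: the merge rule (overlapping or touching half-open intervals), the function names `cluster_sites` and `label_regions`, the `hot_threshold` value, and the toy coordinates and TF names are all assumptions made for illustration only.

```python
from collections import defaultdict


def cluster_sites(sites, max_gap=0):
    """Merge binding sites into regions (illustrative rule, not the paper's).

    sites: iterable of (chrom, start, end, tf) with half-open coordinates.
    Sites on the same chromosome are merged when the next site starts no
    more than max_gap bases after the current region ends.
    Returns a list of dicts with keys chrom, start, end, tfs.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, tf in sites:
        by_chrom[chrom].append((start, end, tf))

    regions = []
    for chrom in sorted(by_chrom):
        current = None
        for start, end, tf in sorted(by_chrom[chrom]):
            if current is not None and start <= current["end"] + max_gap:
                # Extend the open region and record the additional factor.
                current["end"] = max(current["end"], end)
                current["tfs"].add(tf)
            else:
                # Start a new region on this chromosome.
                current = {"chrom": chrom, "start": start, "end": end, "tfs": {tf}}
                regions.append(current)
    return regions


def label_regions(regions, hot_threshold):
    """Tag each region by how many distinct TFs bind it.

    hot_threshold is a hypothetical cutoff chosen by the caller.
    """
    labeled = []
    for r in regions:
        n_tfs = len(r["tfs"])
        label = "HOT" if n_tfs >= hot_threshold else "low-occupancy"
        labeled.append((r["chrom"], r["start"], r["end"], n_tfs, label))
    return labeled


# Toy input: made-up coordinates and factor names.
toy_sites = [
    ("chr2L", 100, 200, "TF_A"),
    ("chr2L", 150, 260, "TF_B"),
    ("chr2L", 250, 300, "TF_C"),
    ("chr2L", 1000, 1100, "TF_A"),
    ("chrX", 50, 120, "TF_B"),
]

toy_regions = cluster_sites(toy_sites)
toy_labels = label_regions(toy_regions, hot_threshold=3)
# Total bases covered by the merged regions.
toy_span = sum(r["end"] - r["start"] for r in toy_regions)

for row in toy_labels:
    print(row)
print("regions:", len(toy_regions), "span (bp):", toy_span)
```

On real data the same two numbers, the region count and the total span, correspond to the kind of summary reported above for each genome, although the actual values depend on the peak-calling and clustering rules used.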