Cawley Simon, Bekiranov Stefan, Ng Huck H, Kapranov Philipp, Sekinger Edward A, Kampa Dione, Piccolboni Antonio, Sementchenko Victor, Cheng Jill, Williams Alan J, Wheeler Raymond, Wong Brant, Drenkow Jorg, Yamanaka Mark, Patel Sandeep, Brubaker Shane, Tammana Hari, Helt Gregg, Struhl Kevin, Gingeras Thomas R
Affymetrix, 3380 Central Expressway, Santa Clara, CA 95051, USA.
Cell. 2004 Feb 20;116(4):499-509. doi: 10.1016/s0092-8674(04)00127-8.
Using high-density oligonucleotide arrays representing essentially all nonrepetitive sequences on human chromosomes 21 and 22, we map the binding sites in vivo for three DNA binding transcription factors, Sp1, cMyc, and p53, in an unbiased manner. This mapping reveals an unexpectedly large number of transcription factor binding site (TFBS) regions, with a minimal estimate of 12,000 for Sp1, 25,000 for cMyc, and 1600 for p53 when extrapolated to the full genome. Only 22% of these TFBS regions are located at the 5' termini of protein-coding genes while 36% lie within or immediately 3' to well-characterized genes and are significantly correlated with noncoding RNAs. A significant number of these noncoding RNAs are regulated in response to retinoic acid, and overlapping pairs of protein-coding and noncoding RNAs are often coregulated. Thus, the human genome contains roughly comparable numbers of protein-coding and noncoding genes that are bound by common transcription factors and regulated by common environmental signals.