Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA.
Science. 2010 Dec 24;330(6012):1775-87. doi: 10.1126/science.1196914. Epub 2010 Dec 22.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.