Stark Alexander, Lin Michael F, Kheradpour Pouya, Pedersen Jakob S, Parts Leopold, Carlson Joseph W, Crosby Madeline A, Rasmussen Matthew D, Roy Sushmita, Deoras Ameya N, Ruby J Graham, Brennecke Julius, Hodges Emily, Hinrichs Angie S, Caspi Anat, Paten Benedict, Park Seung-Won, Han Mira V, Maeder Morgan L, Polansky Benjamin J, Robson Bryanne E, Aerts Stein, van Helden Jacques, Hassan Bassem, Gilbert Donald G, Eastman Deborah A, Rice Michael, Weir Michael, Hahn Matthew W, Park Yongkyu, Dewey Colin N, Pachter Lior, Kent W James, Haussler David, Lai Eric C, Bartel David P, Hannon Gregory J, Kaufman Thomas C, Eisen Michael B, Clark Andrew G, Smith Douglas, Celniker Susan E, Gelbart William M, Kellis Manolis
The Broad Institute, Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts 02140, USA.
Nature. 2007 Nov 8;450(7167):219-32. doi: 10.1038/nature06340.
Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or 'evolutionary signatures', dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies.