Windsor Aaron J, Schranz M Eric, Formanová Natasa, Gebauer-Jung Steffi, Bishop John G, Schnabelrauch Domenica, Kroymann Juergen, Mitchell-Olds Thomas
Max-Planck-Institut für chemische Ökologie, D-07745 Jena, Germany.
Plant Physiol. 2006 Apr;140(4):1169-82. doi: 10.1104/pp.105.073981.
Comparative genomics provides insight into the evolutionary dynamics that shape discrete sequences as well as whole genomes. To advance comparative genomics within the Brassicaceae, we have end-sequenced 23,136 medium-sized insert clones from Boechera stricta, a wild relative of Arabidopsis (Arabidopsis thaliana). A large proportion of these sequences, 18,797, are nonredundant and display highly significant similarity (BLASTn e-value ≤ 10^(-30)) to low copy number Arabidopsis genomic regions, including more than 9,000 annotated coding sequences. We have used this dataset to identify orthologous gene pairs in the two species and to perform a global comparison of DNA regions 5' to annotated coding regions. On average, the 500 nucleotides upstream of coding sequences display 71.4% identity between the two species. In a similar analysis, 61.4% identity was observed between 5' noncoding sequences of Brassica oleracea and Arabidopsis, indicating that regulatory regions are not as diverged among these lineages as previously anticipated. By mapping the B. stricta end sequences onto the Arabidopsis genome, we have identified nearly 2,000 conserved blocks of microsynteny (bracketing 26% of the Arabidopsis genome). A comparison of fully sequenced B. stricta inserts to their homologous Arabidopsis genomic regions indicates that indel polymorphisms >5 kb contribute substantially to the genome size difference observed between the two species. Further, we demonstrate that microsynteny inferred from end-sequence data can be applied to the rapid identification and cloning of genomic regions of interest from nonmodel species. These results suggest that among diploid relatives of Arabidopsis, small- to medium-scale shotgun sequencing approaches can provide rapid and cost-effective benefits to evolutionary and/or functional comparative genomic frameworks.
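The abstract does not state the criteria used to call a block of microsynteny from paired end sequences. The Python sketch below is therefore only one plausible reading, not the authors' method: a clone counts as collinear when its two end reads hit the same Arabidopsis chromosome, in opposite orientations, within a maximum span. The names `EndHit`, `bracketed_interval`, `merge_intervals`, and `bracketed_fraction`, and the 50 kb `max_span` default, are all hypothetical and chosen for illustration.

```python
from typing import NamedTuple, Optional


class EndHit(NamedTuple):
    """Best reference-genome hit for one end read (hypothetical record type)."""
    chrom: str
    pos: int
    strand: str  # "+" or "-"


def bracketed_interval(fwd: EndHit, rev: EndHit,
                       max_span: int = 50_000) -> Optional[tuple]:
    """Return (chrom, start, end) if the two end hits look collinear, else None.

    Assumed criteria (not taken from the paper): same chromosome,
    opposite strands, and a separation of at most max_span bases.
    """
    if fwd.chrom != rev.chrom or fwd.strand == rev.strand:
        return None
    start, end = sorted((fwd.pos, rev.pos))
    if end - start > max_span:
        return None
    return (fwd.chrom, start, end)


def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end) intervals on each chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def bracketed_fraction(intervals, genome_size: int) -> float:
    """Fraction of the reference genome bracketed by the merged intervals."""
    covered = sum(end - start for _, start, end in merge_intervals(intervals))
    return covered / genome_size
```

In practice the end hits would come from the filtered BLASTn results (for example, the best hit per read at the stated e-value cutoff), and `genome_size` would be the length of the reference assembly. Merging before summing avoids counting regions bracketed by several clones more than once.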