Gordon Laurie, Yang Shan, Tran-Gyamfi Mary, Baggott Dan, Christensen Mari, Hamilton Aaron, Crooijmans Richard, Groenen Martien, Lucas Susan, Ovcharenko Ivan, Stubbs Lisa
Genome Biology Group, Lawrence Livermore National Laboratory, Livermore, California 94550, USA.
Genome Res. 2007 Nov;17(11):1603-13. doi: 10.1101/gr.6775107. Epub 2007 Oct 5.
The chicken genome draft sequence has provided a valuable resource for studies of an important agricultural and experimental model species and an important data set for comparative analysis. However, some of the most gene-rich segments are missing from chicken genome draft assemblies, limiting the analysis of a substantial number of genes and preventing a closer look at regions that are especially prone to syntenic rearrangements. To facilitate the functional and evolutionary analysis of one especially gene-rich, rearrangement-prone genomic region, we analyzed sequence from BAC clones spanning chicken microchromosome GGA28; as a complement we also analyzed a gene-sparse, stable region from GGA11. In these two regions we documented the conservation and lineage-specific gain and loss of protein-coding genes and precisely mapped the locations of 31 major human-chicken syntenic breakpoints. Altogether, we identified 72 lineage-specific genes, many of which are found at or near syntenic breaks, implicating evolutionary breakpoint regions as major sites of genetic innovation and change. Twenty-two of the 31 breakpoint regions have been reused repeatedly as rearrangement breakpoints in vertebrate evolution. Compared with stable GC-matched regions, GGA28 is highly enriched in CpG islands, as are break-prone intervals identified elsewhere in the chicken genome; evolutionary breakpoints are further enriched in GC content and CpG islands, highlighting a potential role for these features in genome instability. These data support the hypothesis that chromosome rearrangements have not occurred randomly over the course of vertebrate evolution but are focused preferentially within "fragile" regions with unusual DNA sequence characteristics.