Lindsay Sarah J, Khajavi Mehrdad, Lupski James R, Hurles Matthew E
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.
Am J Hum Genet. 2006 Nov;79(5):890-902. doi: 10.1086/508709. Epub 2006 Sep 26.
Insights into the origins of structural variation and the mutational mechanisms underlying genomic disorders would be greatly improved by a genomewide map of hotspots of nonallelic homologous recombination (NAHR). Moreover, our understanding of sequence variation within the duplicated sequences that are substrates for NAHR lags far behind that of sequence variation within the single-copy portion of the genome. Perhaps the best-characterized NAHR hotspot lies within the 24-kb-long Charcot-Marie-Tooth disease type 1A (CMT1A)-repeats (REPs) that sponsor deletions and duplications that cause peripheral neuropathies. We investigated structural and sequence diversity within the CMT1A-REPs, both within and between species. We discovered a high frequency of retroelement insertions, accelerated sequence evolution after duplication, extensive paralogous gene conversion, and a greater than twofold enrichment of SNPs in humans relative to the genome average. We identified an allelic recombination hotspot underlying the known NAHR hotspot, which suggests that the two processes are intimately related. Finally, we used our data to develop a novel method for inferring the location of an NAHR hotspot from sequence variation within segmental duplications and applied it to identify a putative NAHR hotspot within the LCR22 repeats that sponsor velocardiofacial syndrome deletions. We propose that a large-scale project to map sequence variation within segmental duplications would reveal a wealth of novel chromosomal-rearrangement hotspots.