Department of Genome Sciences, University of Washington School of Medicine, Seattle, 98195, USA.
Cell. 2010 Nov 24;143(5):837-47. doi: 10.1016/j.cell.2010.10.027.
Understanding the prevailing mutational mechanisms responsible for human genome structural variation requires uniformity in the discovery of allelic variants and precision in breakpoint delineation. We develop a resource based on capillary end sequencing of 13.8 million fosmid clones from 17 human genomes and characterize the complete sequence of 1054 large structural variants corresponding to 589 deletions, 384 insertions, and 81 inversions. We analyze the 2081 breakpoint junctions and infer their potential mechanisms of origin. Three mechanisms account for the bulk of germline structural variation: microhomology-mediated processes involving short (2-20 bp) stretches of sequence (28%), nonallelic homologous recombination (22%), and L1 retrotransposition (19%). The high quality and long-range continuity of the sequence reveal more complex mutational mechanisms, including repeat-mediated inversions and gene conversion, that are most often missed by other methods, such as comparative genomic hybridization, single nucleotide polymorphism microarrays, and next-generation sequencing.
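As a quick consistency check on the figures reported in the abstract, the per-class variant counts can be summed and compared with the stated total, and the three leading mechanism percentages can be added to see how much of the variation they jointly cover. This is only an illustrative sketch: the variable names are hypothetical, and the abstract does not state the denominator behind the percentages, so the combined share is simple arithmetic on the reported values rather than a reanalysis.

```python
# Counts and percentages copied from the abstract; names are illustrative only.
variant_counts = {"deletion": 589, "insertion": 384, "inversion": 81}
mechanism_share_pct = {
    "microhomology-mediated processes": 28,
    "nonallelic homologous recombination": 22,
    "L1 retrotransposition": 19,
}

# Sum of the three variant classes; the abstract reports 1054 in total.
total_variants = sum(variant_counts.values())

# Combined share of the three leading mechanisms (denominator not stated in the abstract).
top_three_pct = sum(mechanism_share_pct.values())

print(total_variants)  # 1054
print(top_three_pct)   # 69
```

The class counts add up to the reported 1054 variants, and the three named mechanisms together account for 69% of the total, which matches the abstract's description of them as "the bulk" of germline structural variation.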