Max Planck Institute for Molecular Genetics, RG Development & Disease, Berlin, Germany.
Institute for Medical and Human Genetics, Charité Universitätsmedizin Berlin, Berlin, Germany.
Nat Commun. 2022 Oct 29;13(1):6470. doi: 10.1038/s41467-022-34053-7.
Structural variants are a common cause of disease and contribute substantially to inter-individual variability, but their detection and interpretation remain a challenge. Here, we investigate 11 individuals with complex genomic rearrangements, including germline chromothripsis, by combining short- and long-read genome sequencing (GS) with Hi-C. Large-scale genomic rearrangements are identified in Hi-C interaction maps, allowing an independent assessment of breakpoint calls derived from the GS methods and resulting in >300 genomic junctions. Based on comprehensive breakpoint detection and Hi-C, we reconstruct whole rearranged chromosomes. Integrating information on the three-dimensional organization of chromatin, we observe that breakpoints occur more frequently than expected in lamina-associated domains (LADs) and that a majority reshuffle topologically associating domains (TADs). By applying phased RNA-seq, we observe an enrichment of genes showing allelic imbalanced expression (AIGs) within 100 kb of the breakpoints. Interestingly, the AIGs hit by a breakpoint (19/22) display both up- and downregulation, suggesting that different mechanisms are at play, such as gene disruption and rearrangement of regulatory information. However, the majority of interpretable genes located within 200 kb of a breakpoint do not show significant expression changes. Thus, the genome is overall robust to large-scale chromosome rearrangements.
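The windowed analysis described above (genes within 100 kb of a breakpoint) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Gene` record, the function name `genes_near_breakpoints`, the half-open coordinate convention, and the toy coordinates are all assumptions made for illustration only.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 0-based, end-exclusive."""
    name: str
    chrom: str
    start: int
    end: int


def genes_near_breakpoints(genes, breakpoints, window=100_000):
    """Return names of genes overlapping [pos - window, pos + window]
    for at least one breakpoint on the same chromosome.

    `breakpoints` is an iterable of (chrom, pos) tuples.
    """
    hits = []
    for gene in genes:
        for chrom, pos in breakpoints:
            same_chrom = gene.chrom == chrom
            # Interval overlap test between the gene body and the window.
            overlaps = gene.start <= pos + window and gene.end > pos - window
            if same_chrom and overlaps:
                hits.append(gene.name)
                break  # count each gene once, even if near several breakpoints
    return hits


# Toy data (invented coordinates, not taken from the study).
toy_genes = [
    Gene("A", "chr1", 50_000, 60_000),
    Gene("B", "chr1", 500_000, 510_000),
    Gene("C", "chr2", 90_000, 95_000),
]
toy_breakpoints = [("chr1", 120_000)]
near = genes_near_breakpoints(toy_genes, toy_breakpoints)
```

An enrichment statement such as the one in the abstract would additionally require comparing the fraction of AIGs inside the window against a background expectation; that statistical step is not shown here.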