Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195.
Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195.
Proc Natl Acad Sci U S A. 2018 May 8;115(19):E4433-E4442. doi: 10.1073/pnas.1717600115. Epub 2018 Apr 23.
Structural variation and single-nucleotide variation of the complement factor H (CFH) gene family underlie several complex genetic diseases, including age-related macular degeneration (AMD) and atypical hemolytic uremic syndrome (AHUS). To understand its diversity and evolution, we performed high-quality sequencing of this ∼360-kbp locus in six primate lineages, including multiple human haplotypes. Comparative sequence analyses reveal two distinct periods of gene duplication leading to the emergence of four CFH-related (CFHR) gene paralogs (two ∼25-35 Mya and two ∼7-13 Mya). Remarkably, all evolutionary breakpoints share a common ∼4.8-kbp segment corresponding to an ancestral gene promoter that has expanded independently throughout primate evolution. This segment is recurrently reused and juxtaposed with a donor duplication containing exons 8 and 9 from an ancestral gene, creating four fusion genes that include lineage-specific members of the gene family. Combined analysis of >5,000 AMD cases and controls identifies a significant burden of a rare missense mutation that clusters at the N terminus [P = 5.81 × 10, odds ratio (OR) = 9.8 (3.67-Infinity)]. A bipolar clustering pattern of rare nonsynonymous mutations in patients with AMD (P < 10) and AHUS (P = 0.0079) maps to functional domains that show evidence of positive selection during primate evolution. Our structural variation analysis in >2,400 individuals reveals five recurrent rearrangement breakpoints that show variable frequency among AMD cases and controls. These data suggest a dynamic and recurrent pattern of mutation critical not only to the emergence of new genes but also to the predisposition to complex human genetic disease phenotypes.