Wellcome Sanger Institute, Hinxton, Cambridge, UK.
Present address: Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology, University of Trás-os-Montes and Alto Douro (UTAD), Vila Real, Portugal.
BMC Genomics. 2020 Jun 29;21(1):446. doi: 10.1186/s12864-020-06849-8.
Approximately 5% of the human genome shows common structural variation, which is enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly repeated regions, each about 120 kb in length, carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A (encoded by GYPA) and glycophorin B (encoded by GYPB) are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They are also receptors for invasion by the protist parasite Plasmodium falciparum, a causative agent of malaria. A particular complex structural variant, called DUP4, creates a GYPB-GYPA fusion gene known to confer resistance to malaria. Many other structural variants exist across the glycophorin gene cluster, but they remain poorly characterised.
Here, we analyse sequences from 3234 diploid genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes Project cohort, discovering 9 new variants, and characterising a selection of these variants using fibre-FISH and breakpoint mapping at the sequence level. We identify variants predicted to create novel fusion genes, as well as a common inversion duplication variant present at appreciable frequencies in West Africans. We show that almost all variants can be explained by non-allelic homologous recombination and, by comparing the structural variant breakpoints with recombination hotspot maps, confirm the importance of a particular meiotic recombination hotspot in structural variant formation in this region.
We identify and validate large structural variants in the human glycophorin A-B-E gene cluster, which may be associated with different clinical aspects of malaria.