Webster Matthew T, Clegg John B, Harding Rosalind M
MRC Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Headington, Oxford, OX3 9DS, UK.
Hum Genet. 2003 Jul;113(2):123-39. doi: 10.1007/s00439-003-0954-0. Epub 2003 May 8.
Blocks of linkage disequilibrium (LD) in the human genome represent segments of ancestral chromosomes. To investigate the relationship between LD and genealogy, we analysed diversity associated with restriction fragment length polymorphism (RFLP) haplotypes of the 5' β-globin gene complex. Genealogical analyses were based on sequence alleles that spanned a 12.2-kb interval, covering 3.1 kb around the ψβ gene and 6.2 kb of the δ-globin gene and its 5' flanking sequence known as the R/T region. Diversity was sampled from a Kenyan Luo population where recent malarial selection has contributed to substantial LD. A single common sequence allele spanning the 12.2-kb interval exclusively identified the ancestral chromosome bearing the "Bantu" β(s) (sickle-cell) RFLP haplotype. Other common 5' RFLP haplotypes comprised interspersed segments from multiple ancestral chromosomes. Nucleotide diversity was similar between ψβ and R/T-δ-globin but was non-uniformly distributed within the R/T-δ-globin region. High diversity associated with the 5' R/T identified two ancestral lineages that probably date back more than 2 million years. Within this genealogy, variation has been introduced into the 3' R/T by gene conversion from other ancestral chromosomes. Diversity in δ-globin was found to lead through parts of the main genealogy but to coalesce in a more recent ancestor. The well-known recombination hotspot is clearly restricted to the region 3' of δ-globin. Our analyses show that, whereas one common haplotype in a block of high LD represents a long segment from a single ancestral chromosome, others are mosaics of short segments from multiple ancestors related in genealogies of unsuspected complexity.