Alper Chester A, Larsen Charles E, Dubey Devendra P, Awdeh Zuheir L, Fici Dolores A, Yunis Edmond J
CBR Institute for Biomedical Research, and Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA.
Hum Immunol. 2006 Jan-Feb;67(1-2):73-84. doi: 10.1016/j.humimm.2005.11.006. Epub 2006 Apr 5.
There is great interest in the use of single-nucleotide polymorphisms (SNPs) and linkage disequilibrium (LD) analysis to localize human disease genes. The results suggest that the human genome, including the major histocompatibility complex (MHC), consists largely of 5- to 200-kb blocks of sequence fixity between which random recombination occurs. Direct determination of MHC haplotypes from family studies also demonstrates similar-sized blocks, but otherwise gives a very different picture, with a third to a half of Caucasian haplotypes fixed from HLA-B to HLA-DR/DQ (at least 1 Mb) as conserved extended haplotypes (CEHs), some of which encompass more than 3 Mb. These fixed haplotypes differ in frequency both in different Caucasian subpopulations and in Caucasian patients with HLA-associated diseases, complicating disease susceptibility gene localization. The inherent inability of LD analysis to "see" DNA fixity beyond three markers contributes to the failure of SNP/LD analysis to define in detail or even detect CEHs in the MHC and probably elsewhere in the genome. More importantly, the use of statistical analysis, rather than direct haplotype determination and counting, fails to reveal the details of haplotype structure essential for gene localization. Given the oversimplified picture of the MHC (and probably the rest of the genome) provided only by SNP/LD-defined blocks, it is questionable whether this approach will be of great help in disease susceptibility gene localization or identification.
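The contrast drawn above between pairwise statistical LD measures and direct haplotype determination and counting can be illustrated with a minimal sketch. It is not the authors' method. The function names, the four-locus layout, and the toy haplotypes are all hypothetical and chosen only for illustration. The sketch computes the standard pairwise coefficients D and r² for two loci, then simply counts whole phased haplotypes across all loci.

```python
from collections import Counter

def pairwise_ld(haplotypes, i, j, allele_i, allele_j):
    """Return (D, r2) for allele_i at locus i and allele_j at locus j.

    D  = p_ab - p_a * p_b
    r2 = D^2 / (p_a (1 - p_a) p_b (1 - p_b))
    """
    n = len(haplotypes)
    p_a = sum(h[i] == allele_i for h in haplotypes) / n
    p_b = sum(h[j] == allele_j for h in haplotypes) / n
    p_ab = sum(h[i] == allele_i and h[j] == allele_j for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2

def count_haplotypes(haplotypes):
    """Directly count each full multi-locus haplotype."""
    return Counter(haplotypes)

# Hypothetical phased haplotypes over four biallelic loci (invented toy data,
# not taken from the paper): two "extended" haplotypes dominate the sample.
toy = [(1, 1, 1, 1)] * 4 + [(0, 0, 0, 0)] * 4 + [(1, 0, 1, 0), (0, 1, 0, 1)]

d, r2 = pairwise_ld(toy, 0, 1, 1, 1)
print(f"pairwise D={d:.2f}, r2={r2:.2f} for loci 0 and 1")
for hap, count in count_haplotypes(toy).most_common():
    print(hap, count)
```

In this toy sample the pairwise statistic summarizes two loci at a time, whereas the direct count shows which complete haplotypes are repeated across all four loci. That is the kind of detail the abstract argues is lost when statistical analysis replaces haplotype counting.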