Yuan Ao, Chen Guanjie, Chen Yuanxiu, Rotimi Charles, Bonney George E
Statistical Genetics and Bioinformatics Unit, National Human Genome Center, Howard University, Washington, DC 20059, USA.
Genetics. 2004 Jul;167(3):1445-59. doi: 10.1534/genetics.103.021600.
There are generally three steps to isolate a disease linkage-susceptibility gene: a genome-wide scan, fine mapping, and, last, positional cloning. The last step is time consuming and involves intensive laboratory work. In some cases, fine mapping cannot proceed further on a set of markers because they are tightly linked. For years, genetic statisticians have been trying different ways to narrow the fine-mapping results to provide some guidance for the next step of laboratory work. Although these methods are practical and efficient, most of them are based on identity-by-descent (IBD) data, which usually can be inferred only from the genotype data with some uncertainty. The corresponding methods thus have no greater power than one using genotype data directly. Also, IBD-based methods apply only to relative pair data. Here, using genotype data, we have developed a statistical hypothesis-testing method to pinpoint a SNP, or SNPs, suspected of responsibility for a disease trait linkage among a set of SNPs tightly linked in a region. Our method uses genotype data from affected individuals or from case-control studies, which are widely available in the laboratory. The test statistic can be constructed using any genotype-based disease-marker disequilibrium measure and is asymptotically distributed as a chi-square mixture. This method can be used for singleton data, relative pair data, or general pedigree data. We have applied the method to simulated data as well as a real data set; it gives satisfactory results.
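As a rough illustration of what a genotype-based disease-marker disequilibrium measure can look like in a case-control setting, the sketch below computes a standard allelic Pearson chi-square for each SNP in a region and ranks the SNPs by it. This is not the authors' test statistic: it treats each SNP separately, ignores the linkage disequilibrium among tightly linked SNPs, and does not involve the chi-square mixture distribution mentioned in the abstract. The function names, SNP labels, and allele counts are all hypothetical.

```python
# Illustrative only: a per-SNP allelic association measure, NOT the
# hypothesis test described in the abstract. All names and counts here
# are hypothetical.

def allelic_chi_square(case_alt, case_total, control_alt, control_total):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts.

    case_alt / control_alt are counts of the alternate allele;
    case_total / control_total are total allele counts in each group.
    """
    a = case_alt
    b = case_total - case_alt
    c = control_alt
    d = control_total - control_alt
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_snps(counts):
    """Rank SNPs by the allelic chi-square, largest first.

    counts maps a SNP label to
    (case_alt, case_total, control_alt, control_total).
    """
    scores = {snp: allelic_chi_square(*c) for snp, c in counts.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical allele counts for three tightly linked SNPs.
    region = {
        "snp_a": (60, 100, 40, 100),
        "snp_b": (50, 100, 50, 100),
        "snp_c": (55, 100, 45, 100),
    }
    for snp, score in rank_snps(region):
        print(f"{snp}\t{score:.2f}")
```

A ranking of this kind only orders SNPs by marginal association. Deciding which SNP among tightly linked ones is responsible for the signal is the harder problem the abstract addresses, and it requires a test that accounts for the dependence between markers.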