Stram Daniel O, Haiman Christopher A, Hirschhorn Joel N, Altshuler David, Kolonel Laurence N, Henderson Brian E, Pike Malcolm C
Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, Calif. 90033, USA.
Hum Hered. 2003;55(1):27-36. doi: 10.1159/000071807.
We describe an approach for picking haplotype-tagging single nucleotide polymorphisms (htSNPs) that is currently being used in two large nested case-control studies within a multiethnic cohort (MEC). These studies are searching for associations between risk of prostate and breast cancer and common genetic variation in candidate genes. Based on a preliminary sample of 70 control subjects chosen at random from each of the 5 ethnic groups in the MEC, we estimate haplotype frequencies using a variant of the Excoffier-Slatkin E-M algorithm, after genotyping a high density of SNPs selected every 3-5 kb in and surrounding a candidate gene. To evaluate the performance of a candidate set of htSNPs (which will be genotyped in the much larger case-control sample), we treat the haplotype frequencies estimated above as known and carry out a formal calculation of the uncertainty in the number of copies of each common haplotype carried by an individual, summarizing this calculation as a coefficient of determination, R²ₕ. A candidate set of htSNPs of a given size is chosen so as to maximize the minimum value of R²ₕ over the common haplotypes, h.
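The selection criterion described above can be illustrated with a small sketch. It is not the authors' software. It assumes Hardy-Weinberg equilibrium (so a haplotype pair has probability equal to the product of the two frequencies), biallelic SNPs coded 0/1, and one common reading of the coefficient of determination: the variance of the expected copy number of haplotype h given the unphased htSNP genotype, divided by the total variance 2p(1 - p). The function names `r2_h` and `pick_htsnps`, the exhaustive subset search, and the 5% cutoff for "common" are hypothetical choices made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def r2_h(haps, freqs, tags, h):
    """R^2 for predicting copies of haplotype h from unphased tag-SNP genotypes.

    Assumes Hardy-Weinberg equilibrium and known haplotype frequencies.
    haps: list of tuples of 0/1 alleles; freqs: matching frequencies;
    tags: indices of the candidate htSNPs; h: index of the target haplotype.
    """
    p_h = freqs[h]
    prob = defaultdict(float)      # P(G) for each tag-SNP genotype G
    weighted = defaultdict(float)  # sum of P(pair) * copies of h, per G
    for i, hap_i in enumerate(haps):
        for j, hap_j in enumerate(haps):
            # Unphased genotype: allele counts (0, 1 or 2) at each tag SNP.
            g = tuple(hap_i[s] + hap_j[s] for s in tags)
            p = freqs[i] * freqs[j]
            prob[g] += p
            weighted[g] += p * ((i == h) + (j == h))
    # Sum over G of P(G) * E[copies | G]^2, using E = weighted / prob.
    second_moment = sum(weighted[g] ** 2 / prob[g] for g in prob if prob[g] > 0)
    var_cond_mean = second_moment - (2 * p_h) ** 2
    return var_cond_mean / (2 * p_h * (1 - p_h))


def pick_htsnps(haps, freqs, k, common=0.05):
    """Exhaustively pick k tag SNPs maximizing the minimum R^2 over common haplotypes.

    The 5% default for 'common' is an illustrative assumption.
    """
    common_idx = [h for h, f in enumerate(freqs) if f >= common]
    best = None
    for tags in combinations(range(len(haps[0])), k):
        worst = min(r2_h(haps, freqs, tags, h) for h in common_idx)
        if best is None or worst > best[1]:
            best = (tags, worst)
    return best


# Toy data (made up for illustration): three haplotypes over three SNPs.
haps = [(0, 0, 0), (1, 1, 0), (0, 1, 1)]
freqs = [0.5, 0.3, 0.2]
tags, min_r2 = pick_htsnps(haps, freqs, k=2)
print(tags, round(min_r2, 6))
```

In this toy example no single SNP separates all three haplotypes, so the minimum R² for a set of size one falls below 1, while two SNPs suffice to resolve every haplotype pair. Exhaustive search is only practical for small numbers of SNPs; a real analysis would likely need a smarter search.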