Wilcox Marsha A, Pugh Elizabeth W, Zhang Heping, Zhong Xiaoyun, Levinson Douglas F, Kennedy Giulia C, Wijsman Ellen M
Genetics Program, Division of Graduate Medical Sciences, Department of Medicine, Boston University School of Medicine, Boston, Massachusetts 02118, USA.
Genet Epidemiol. 2005;29 Suppl 1:S7-28. doi: 10.1002/gepi.20106.
The papers in presentation groups 1-3 of Genetic Analysis Workshop 14 (GAW14) compared microsatellite (MS) markers and single-nucleotide polymorphism (SNP) markers on a variety of factors, applying multiple methods to both data sets provided to GAW participants. Group 1 focused on data provided from the Collaborative Study on the Genetics of Alcoholism (COGA). Group 2 focused on data simulated for the workshop. Group 3 contained analyses of both data sets. Issues examined included information content, signal strength, localization of the signal, use of haplotype blocks, population structure, power, type I error, control of type I error, the effect of linkage disequilibrium, and computational challenges. Several broad observations resulted. 1) Information content was higher for dense SNP marker panels than for MS panels, and dense SNP marker sets appeared to provide slightly higher linkage scores and slightly higher power to detect linkage than MS markers. 2) Dense SNP panels also gave higher type I error rates, suggesting that increased test thresholds may be needed to maintain the correct error rate. 3) Dense SNP panels provided better trait localization, but only in the COGA data, in which the MS markers were relatively loosely spaced. 4) The strength of linkage signals did not vary with the density of SNP panels once the marker density reached approximately 1 SNP/cM. 5) Analyses with SNPs were computationally challenging and identified areas where analysis tools will need to improve before such analyses are practical for widespread use.