Laboratory for Genetic Epidemiology, Western Australian Institute for Medical Research, UWA Centre for Medical Research, University of Western Australia, Ground Floor, B Block, Hospital Avenue, Nedlands, Australia.
BMC Genet. 2005 Dec 30;6 Suppl 1(Suppl 1):S151. doi: 10.1186/1471-2156-6-S1-S151.
We used our newly developed linkage disequilibrium (LD) plotting software, JLIN, to plot LD between pairs of single-nucleotide polymorphisms (SNPs) on three chromosomes of the Genetic Analysis Workshop 14 Aipotu simulated population, in order to assess the effect of missing data on LD calculations. We used our haplotype analysis program, SIMHAP, to assess the effect of missing data on haplotype-phenotype association. Genotype data were removed at random at rates of 1%, 5%, and 10%, and the LD calculations and haplotype association results at each level of missingness were compared with those for the complete dataset. We concluded that ignoring individuals with missing data substantially affects the number of LD regions detected, which in turn could affect the tagging SNPs chosen to generate haplotypes.
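The missing-data experiment described in the abstract can be illustrated with a small, self-contained sketch. This is not JLIN or SIMHAP, and it does not use the Aipotu data. It assumes a toy dataset of three biallelic SNPs coded as allele counts (0, 1, 2), and it substitutes the squared Pearson correlation of allele counts for a haplotype-based LD measure. All function names, allele frequencies, and the sample size are hypothetical choices made for illustration.

```python
import random


def simulate(n_individuals, rng):
    """Toy genotypes (assumed, not Aipotu data): SNP 0 and SNP 1 are in LD, SNP 2 is independent."""
    rows = []
    for _ in range(n_individuals):
        g = [0, 0, 0]
        for _ in range(2):  # two haplotypes per individual
            a = 1 if rng.random() < 0.4 else 0
            # With probability 0.9 the allele at SNP 1 copies SNP 0, which induces LD.
            b = a if rng.random() < 0.9 else (1 if rng.random() < 0.4 else 0)
            c = 1 if rng.random() < 0.3 else 0
            g[0] += a
            g[1] += b
            g[2] += c
        rows.append(g)
    return rows


def mask_genotypes(genotypes, rate, rng):
    """Return a copy in which each call is independently set to None with probability `rate`."""
    return [[None if rng.random() < rate else g for g in row] for row in genotypes]


def complete_cases(genotypes, i, j):
    """Keep only individuals typed at both SNP i and SNP j (missing individuals are ignored)."""
    return [(row[i], row[j]) for row in genotypes
            if row[i] is not None and row[j] is not None]


def fully_typed(genotypes):
    """Count individuals with no missing call at any SNP."""
    return sum(1 for row in genotypes if all(g is not None for g in row))


def genotype_r2(pairs):
    """Squared Pearson correlation of allele counts, used here as a stand-in for LD r^2."""
    n = len(pairs)
    if n < 2:
        return None
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return None  # undefined when a SNP is monomorphic in the retained sample
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    rng = random.Random(14)
    data = simulate(500, rng)
    for rate in (0.0, 0.01, 0.05, 0.10):
        masked = mask_genotypes(data, rate, rng)
        pairs = complete_cases(masked, 0, 1)
        print(f"missing={rate:.0%} fully typed={fully_typed(masked)} "
              f"pairs used={len(pairs)} r2(SNP0,SNP1)={genotype_r2(pairs)}")
```

Because every SNP can be missing independently, the number of fully typed individuals shrinks faster than the per-SNP missing rate as more SNPs are considered. That shrinkage is the mechanism through which discarding incomplete individuals can change which SNP pairs appear to be in LD.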