Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah 84108-1266, USA.
Genet Epidemiol. 2009 Nov;33(7):628-36. doi: 10.1002/gepi.20414.
We present the "sumLINK" statistic--the sum of multipoint LOD scores for the subset of pedigrees with nominally significant linkage evidence at a given locus--as an alternative to common methods to identify susceptibility loci in the presence of heterogeneity. We also suggest the "sumLOD" statistic (the sum of positive multipoint LOD scores) as a companion to the sumLINK. sumLINK analysis identifies genetic regions of extreme consistency across pedigrees without regard to negative evidence from unlinked or uninformative pedigrees. Significance is determined by an innovative permutation procedure based on genome shuffling that randomizes linkage information across pedigrees. This procedure for generating the empirical null distribution may be useful for other linkage-based statistics as well. Using 500 genome-wide analyses of simulated null data, we show that the genome shuffling procedure results in the correct type 1 error rates for both the sumLINK and sumLOD. The power of the statistics was tested using 100 sets of simulated genome-wide data from the alternative hypothesis from GAW13. Finally, we illustrate the statistics in an analysis of 190 aggressive prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics, where we identified a new susceptibility locus. We propose that the sumLINK and sumLOD are ideal for collaborative projects and meta-analyses, as they do not require any sharing of identifiable data between contributing institutions. Further, loci identified with the sumLINK have good potential for gene localization via statistical recombinant mapping, as, by definition, several linked pedigrees contribute to each peak.
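The two statistics and the shuffling idea can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. The abstract does not give the LOD cutoff for "nominally significant" or the mechanics of genome shuffling, so two assumptions are made here: the cutoff `NOMINAL_LOD = 0.588` (roughly pointwise p = 0.05 under the usual one-sided chi-square approximation), and shuffling as an independent random circular rotation of each pedigree's genome-wide LOD vector. All function names and the genome-wide maximum used for the p-value are hypothetical choices for illustration.

```python
import random

# Assumption: LOD cutoff for "nominally significant" linkage evidence.
# 0.588 corresponds to roughly pointwise p = 0.05; the abstract does not state it.
NOMINAL_LOD = 0.588


def sum_link(lods, threshold=NOMINAL_LOD):
    """Sum the LOD scores of pedigrees whose LOD at this locus meets the cutoff."""
    return sum(x for x in lods if x >= threshold)


def sum_lod(lods):
    """Sum the positive LOD scores at this locus, ignoring negative evidence."""
    return sum(x for x in lods if x > 0)


def statistic_by_locus(lod_matrix, stat):
    """Apply `stat` at every locus; lod_matrix[p][k] is pedigree p's LOD at locus k."""
    n_loci = len(lod_matrix[0])
    return [stat([row[k] for row in lod_matrix]) for k in range(n_loci)]


def shuffle_genomes(lod_matrix, rng):
    """Assumed shuffling scheme: rotate each pedigree's LOD vector by its own
    random offset, so within-pedigree structure is kept but pedigrees no longer
    line up at the same loci."""
    shuffled = []
    for row in lod_matrix:
        offset = rng.randrange(len(row))
        shuffled.append(row[offset:] + row[:offset])
    return shuffled


def empirical_p_value(lod_matrix, stat, n_perm=1000, seed=0):
    """Genome-wide empirical p-value for the largest observed value of `stat`,
    using the maximum over each shuffled genome as the null reference."""
    rng = random.Random(seed)
    observed_peak = max(statistic_by_locus(lod_matrix, stat))
    hits = 0
    for _ in range(n_perm):
        null_peak = max(statistic_by_locus(shuffle_genomes(lod_matrix, rng), stat))
        if null_peak >= observed_peak:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: 3 pedigrees x 5 loci, values invented purely for illustration.
    toy = [
        [0.1, 1.2, -0.4, 0.0, 0.3],
        [-0.2, 0.9, 0.1, -0.1, 0.0],
        [0.0, 0.7, -0.3, 0.2, -0.5],
    ]
    print(statistic_by_locus(toy, sum_link))
    print(statistic_by_locus(toy, sum_lod))
    print(empirical_p_value(toy, sum_link, n_perm=200))
```

Because both statistics need only per-pedigree LOD scores at each locus, a contributing site could supply a matrix like `toy` without sharing genotypes or other identifiable data, which is the property the abstract highlights for collaborative analyses.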