Wiltshire Steven, Cardon Lon R, McCarthy Mark I
Imperial College Genetics and Genomics Research Institute, Imperial College, London, United Kingdom.
Am J Hum Genet. 2002 Nov;71(5):1175-82. doi: 10.1086/342976. Epub 2002 Sep 25.
The evaluation of results from primary genomewide linkage scans of complex human traits remains an area of importance and considerable debate. Apart from the usual assessment of statistical significance by use of asymptotic and empirical calculations, an additional means of evaluation, based on counting the number of distinct regions showing evidence of linkage, is possible. We have explored the characteristics of such a locus-counting method over a range of experimental conditions typically encountered during genomewide scans for complex trait loci. Under the null hypothesis, factors that have an impact on the informativeness of the data (such as map density, availability of parental data, and completeness of genotyping) are seen to markedly influence the number of regions of excess allele sharing and the empirically derived genomewide significance of the associated LOD score thresholds. In some circumstances, the expected number of regions is less than one-quarter of that predicted under the assumption of a dense map and complete extraction of inheritance information. We have applied this method to a previously analyzed data set, the Warren 2 genome scan for type 2 diabetes susceptibility, and demonstrate that more regions showing evidence for linkage were observed in the primary genome scan than would be expected by chance, across the whole range of LOD scores, even though no single linkage result achieved empirical genomewide statistical significance. Locus counting may be useful in assessing the results from genome scans for complex traits in general, especially because relatively few scans generate evidence for linkage reaching genomewide significance by dense-map criteria. By taking account of the effects of reduced data informativeness on the expected number of regions showing evidence for linkage, a more meaningful, and less conservative, evaluation of the results from such linkage studies is possible.
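The locus-counting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the function names `count_linkage_regions` and `compare_to_null`, the rule that two peaks belong to the same region when they lie within `min_gap_cm` centimorgans of each other, and the use of the simple proportion of null replicates as the empirical p-value. The null counts would come from simulated scans that match the real data's map density, parental data availability, and genotyping completeness; producing those simulations is outside the scope of this sketch.

```python
from typing import Sequence, Tuple


def count_linkage_regions(
    positions_cm: Sequence[float],
    lod_scores: Sequence[float],
    threshold: float,
    min_gap_cm: float = 30.0,  # assumed merging distance, not taken from the paper
) -> int:
    """Count distinct regions on one chromosome with LOD >= threshold.

    Positions must be sorted in ascending order. Consecutive points at or
    above the threshold are treated as one region unless they are separated
    by more than min_gap_cm.
    """
    count = 0
    last_hit = None  # position of the most recent point at or above threshold
    for pos, lod in zip(positions_cm, lod_scores):
        if lod >= threshold:
            if last_hit is None or pos - last_hit > min_gap_cm:
                count += 1  # far enough from the previous hit: a new region
            last_hit = pos
    return count


def compare_to_null(
    observed_count: int, null_counts: Sequence[int]
) -> Tuple[float, float]:
    """Return (expected count under the null, empirical p-value).

    The expected count is the mean over null replicates. The p-value is the
    proportion of replicates with at least as many regions as observed.
    """
    if not null_counts:
        raise ValueError("null_counts must contain at least one replicate")
    expected = sum(null_counts) / len(null_counts)
    at_least = sum(1 for c in null_counts if c >= observed_count)
    return expected, at_least / len(null_counts)


if __name__ == "__main__":
    # Toy LOD profile on a single hypothetical chromosome.
    positions = [0, 10, 20, 30, 40, 50, 60]
    lods = [0.1, 1.2, 1.5, 0.3, 0.2, 1.1, 0.0]
    observed = count_linkage_regions(positions, lods, threshold=1.0, min_gap_cm=25.0)

    # Hypothetical region counts from four simulated null scans.
    expected, p_value = compare_to_null(observed, [0, 1, 0, 2])
    print(observed, expected, p_value)
```

For a whole genome, the per-chromosome counts would be summed before the comparison, and the comparison repeated at each LOD threshold of interest. Because the expected count depends on how informative the data are, the null replicates need to be generated under the same conditions as the real scan rather than taken from dense-map predictions.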