Song Kyunghee K, Weeks Daniel E, Sobel Eric, Feingold Eleanor
Department of Human Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA.
Genet Epidemiol. 2004 Feb;26(2):88-96. doi: 10.1002/gepi.10296.
In many genetic linkage analyses, the P value is obtained through simulation because the underlying distribution of the test statistic is complex and unknown. However, such simulation can be very computationally intensive. A "bootstrap/replicate pool" approach has been suggested that generates P values with less computation by resampling sums from a small set of simulated replicates for each pedigree. The replicate pool idea has been successfully applied but, to our knowledge, has never been studied theoretically. An entirely different method for increasing the computational efficiency of P value simulation is Besag and Clifford's sequential sampling method. We propose an algorithm that combines Besag and Clifford's method with the replicate pool method to efficiently estimate P values for linkage studies. We derive variance expressions for the P value estimates from the replicate pool method and from our proposed hybrid method, and use these to show that the hybrid estimator has a substantial advantage over the other methods in most situations.
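The two ideas the abstract combines can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes that the overall test statistic is a sum of per-pedigree statistics, that larger values are more extreme, and that the Besag and Clifford stopping rule takes its commonly stated form: stop once `h` simulated values reach the observed one, or after a fixed maximum number of draws. The function names, the parameters `h`, `max_draws`, and `pool_size`, and the standard-normal toy null are all hypothetical choices made for illustration.

```python
import random


def build_pools(simulate_pedigree_stat, n_pedigrees, pool_size, rng):
    """Simulate a small pool of null statistics for each pedigree (done once)."""
    return [
        [simulate_pedigree_stat(i, rng) for _ in range(pool_size)]
        for i in range(n_pedigrees)
    ]


def draw_pooled_statistic(pools, rng):
    """Form one null replicate of the summed statistic by picking one
    pooled value per pedigree at random and adding them up."""
    return sum(rng.choice(pool) for pool in pools)


def sequential_p_value(observed, draw_null, h, max_draws):
    """Sequential Monte Carlo P value estimate.

    Stop as soon as h null draws are >= observed and return h / draws used.
    If that never happens within max_draws draws, return
    (exceedances + 1) / (max_draws + 1).
    """
    exceed = 0
    for n_drawn in range(1, max_draws + 1):
        if draw_null() >= observed:
            exceed += 1
            if exceed == h:
                return exceed / n_drawn
    return (exceed + 1) / (max_draws + 1)


# Toy usage: the per-pedigree null statistic is a standard normal draw
# (an assumption made only so that the sketch runs end to end).
rng = random.Random(0)
pools = build_pools(lambda i, r: r.gauss(0.0, 1.0),
                    n_pedigrees=20, pool_size=50, rng=rng)
p_hat = sequential_p_value(
    observed=3.0,
    draw_null=lambda: draw_pooled_statistic(pools, rng),
    h=10,
    max_draws=999,
)
print(p_hat)
```

In this sketch the expensive per-pedigree simulation happens only once, in `build_pools`. Each later draw costs one lookup per pedigree, and the sequential rule ends the loop early when the observed statistic is clearly not extreme.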