Allen Andrew S, Satten Glen A
Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina 27710, USA.
Genet Epidemiol. 2008 Jan;32(1):29-40. doi: 10.1002/gepi.20259.
Haplotype-based analyses are thought to play a major role in the study of common complex diseases. This has led to the development of a variety of statistical methods for detecting disease-haplotype associations from case-control study data. However, haplotype phase is often uncertain when only genotype data are available. Methods that account for haplotype ambiguity by modeling the distribution of haplotypes can, if this distribution is misspecified, lead to substantial bias in parameter estimates even when complete genotype data are available. Here we study estimators that can be derived from score functions of appropriate likelihoods. We use the efficient score approach to estimation in the presence of nuisance parameters to derive novel estimators that are robust to the haplotype distribution. We establish key relationships between estimators and study their empirical performance via simulation.
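The phase uncertainty mentioned in the abstract can be made concrete with a small sketch. It is an illustration only, not the authors' method: the function name `compatible_haplotype_pairs` and the coding of genotypes as minor-allele counts (0, 1, or 2 per SNP) are assumptions introduced here. The sketch enumerates every unordered pair of haplotypes consistent with one unphased genotype; a genotype that is heterozygous at two or more sites is compatible with more than one pair, which is why methods need some way to handle the unobserved phase.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: sequence of minor-allele counts (0, 1, or 2), one per SNP.
        This coding is an assumption made for the illustration.
    Returns a sorted list of (hap_a, hap_b) tuples, where each haplotype is a
    tuple of 0/1 alleles and hap_a <= hap_b.
    """
    if any(g not in (0, 1, 2) for g in genotype):
        raise ValueError("genotype entries must be 0, 1, or 2")

    # Heterozygous sites are the only ones whose phase is ambiguous.
    het_sites = [i for i, g in enumerate(genotype) if g == 1]

    pairs = set()
    for phase in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites: count 0 -> allele 0 on both, count 2 -> allele 1 on both.
        hap_a = [g // 2 for g in genotype]
        hap_b = list(hap_a)
        # Heterozygous sites: one haplotype carries each allele.
        for site, allele in zip(het_sites, phase):
            hap_a[site] = allele
            hap_b[site] = 1 - allele
        # Sort within the pair so (a, b) and (b, a) count once.
        pairs.add(tuple(sorted((tuple(hap_a), tuple(hap_b)))))
    return sorted(pairs)


if __name__ == "__main__":
    # Heterozygous at both SNPs: two possible phasings.
    for pair in compatible_haplotype_pairs([1, 1]):
        print(pair)
```

With k heterozygous sites (k at least 1) there are 2^(k-1) compatible pairs, so the ambiguity grows quickly with the number of SNPs. A genotype with at most one heterozygous site has a single compatible pair and its phase is known.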