Martin Eden R, Bass Meredyth P, Hauser Elizabeth R, Kaplan Norman L
Department of Medicine, Duke University Medical Center, Durham, NC 27710, USA.
Am J Hum Genet. 2003 Nov;73(5):1016-26. doi: 10.1086/378779. Epub 2003 Oct 9.
In studies of complex diseases, a common paradigm is to conduct association analysis at markers in regions identified by linkage analysis, to attempt to narrow the region of interest. Family-based tests for association based on parental transmissions to affected offspring are often used in fine-mapping studies. However, for diseases with late onset, parental genotypes are often missing. Without parental genotypes, family-based tests either compare allele frequencies in affected individuals with those in their unaffected siblings or use siblings to infer missing parental genotypes. An example of the latter approach is the score test implemented in the computer program TRANSMIT. The inference of missing parental genotypes in TRANSMIT assumes that transmissions from parents to affected siblings are independent, which is appropriate when there is no linkage. However, using computer simulations, we show that, when the marker and disease locus are linked and the data set consists of families with multiple affected siblings, this assumption leads to a bias in the score statistic under the null hypothesis of no association between the marker and disease alleles. This bias leads to an inflated type I error rate for the score test in regions of linkage. We present a novel test for association in the presence of linkage (APL) that correctly infers missing parental genotypes in regions of linkage by estimating identity-by-descent parameters, to adjust for correlation between parental transmissions to affected siblings. In simulated data, we demonstrate the validity of the APL test under the null hypothesis of no association and show that the test can be more powerful than the pedigree disequilibrium test and family-based association test. As an example, we compare the performance of the tests in a candidate-gene study in families with Parkinson disease.
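The central claim, that parental transmissions to affected siblings stop being independent once the marker is linked to a disease locus, can be illustrated with a small simulation. The sketch below is neither TRANSMIT nor the APL test. It simulates affected sib pairs under no association and reports how often both siblings receive the same paternal marker haplotype. The function name, the risk-allele frequency, and the penetrances are hypothetical values chosen only for illustration.

```python
import random


def same_transmission_rate(theta, n_families=5000, freq_d=0.2,
                           penetrance=(0.05, 0.5, 0.9), seed=0):
    """Fraction of affected sib pairs that received the same paternal
    marker haplotype, for recombination fraction `theta` between the
    marker and the disease locus.

    All parameter values are illustrative assumptions, not taken from
    the paper. Marker alleles are not simulated: under the null of no
    association they are independent of disease alleles, so it is
    enough to track which parental haplotype is transmitted.
    """
    rng = random.Random(seed)
    kept = 0
    shared = 0
    while kept < n_families:
        # Each parent carries two haplotypes; True marks a risk allele.
        father = [rng.random() < freq_d, rng.random() < freq_d]
        mother = [rng.random() < freq_d, rng.random() < freq_d]
        sibs = []
        for _ in range(2):
            pat = rng.randrange(2)  # paternal haplotype at disease locus
            mat = rng.randrange(2)  # maternal haplotype at disease locus
            n_risk = father[pat] + mother[mat]
            affected = rng.random() < penetrance[n_risk]
            # With probability theta a recombination switches the
            # paternal haplotype between disease locus and marker.
            pat_marker = pat if rng.random() >= theta else 1 - pat
            sibs.append((affected, pat_marker))
        # Ascertain only families in which both siblings are affected.
        if sibs[0][0] and sibs[1][0]:
            kept += 1
            shared += sibs[0][1] == sibs[1][1]
    return shared / kept


if __name__ == "__main__":
    print("tightly linked (theta=0.0):", same_transmission_rate(0.0))
    print("unlinked       (theta=0.5):", same_transmission_rate(0.5))
```

With an unlinked marker the rate should sit near 0.5, which is what independent transmissions imply. With a tightly linked marker it rises above 0.5. That excess sharing is the correlation that an inference procedure assuming independent transmissions ignores when it reconstructs missing parental genotypes.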