Neale Michael C
Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond 23298, USA.
Twin Res. 2003 Jun;6(3):235-9. doi: 10.1375/136905203765693898.
Most analyses of data collected from a classical twin study of monozygotic (MZ) and dizygotic (DZ) twins assume that zygosity has been diagnosed without error. However, large-scale surveys frequently resort to questionnaire-based methods of diagnosis, which classify twins as MZ or DZ with less than perfect accuracy. This article describes a mixture distribution approach to the analysis of twin data when zygosity is not perfectly diagnosed. Estimates of diagnostic accuracy are used to weight the likelihood of the data according to the probability that any given pair is either MZ or DZ. The performance of this method is compared to analysis with fully accurate diagnosis, and to the analysis of samples that include some misclassified pairs. Conventional analysis of samples containing misclassified pairs yields biased estimates of variance components, such that additive genetic variance (A) is underestimated while common environment (C) and specific environment (E) components are overestimated. The bias is non-trivial; for 10% misclassification, true values of the additive genetic (A), common environment (C) and specific environment (E) variance components of 0.6:0.2:0.2 are estimated as 0.48:0.29:0.23, respectively. The mixture distribution yields unbiased estimates, while showing relatively little loss of statistical precision for misclassification rates of 15% or less. The method is shown to perform quite well even when no information on zygosity is available, and may be applied when pair-specific estimates of zygosity probabilities are available.
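The bias figures and the weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation, which the abstract does not specify. The function names are hypothetical, and the sketch assumes a standardized ACE model (MZ covariance A + C, DZ covariance A/2 + C), zero-mean bivariate normal pair data, equal numbers of MZ and DZ pairs, and the same misclassification rate in both directions. The naive estimates use simple moment matching rather than maximum likelihood.

```python
import math


def ace_covariances(a, c, e):
    """Expected (variance, MZ covariance, DZ covariance) under the ACE model."""
    return a + c + e, a + c, 0.5 * a + c


def naive_estimates(a, c, e, misclass):
    """Moment-based A, C, E estimates when labels are taken at face value.

    Assumption: a fraction `misclass` of each labelled group actually
    belongs to the other zygosity, and both groups are the same size.
    """
    var, cov_mz, cov_dz = ace_covariances(a, c, e)
    # Covariance observed in each labelled group is a blend of the two.
    obs_mz = (1 - misclass) * cov_mz + misclass * cov_dz
    obs_dz = (1 - misclass) * cov_dz + misclass * cov_mz
    a_hat = 2 * (obs_mz - obs_dz)
    c_hat = 2 * obs_dz - obs_mz
    e_hat = var - obs_mz
    return a_hat, c_hat, e_hat


def pair_logpdf(x1, x2, var, cov):
    """Log density of a zero-mean bivariate normal with equal variances."""
    det = var * var - cov * cov
    quad = (var * x1 * x1 - 2 * cov * x1 * x2 + var * x2 * x2) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def mixture_loglik(pairs, a, c, e):
    """Sum of log mixture likelihoods over pairs.

    `pairs` holds (x1, x2, p_mz) tuples, where p_mz is the probability
    that the pair is MZ. The MZ and DZ likelihoods are weighted by
    p_mz and 1 - p_mz respectively.
    """
    var, cov_mz, cov_dz = ace_covariances(a, c, e)
    total = 0.0
    for x1, x2, p_mz in pairs:
        lik_mz = math.exp(pair_logpdf(x1, x2, var, cov_mz))
        lik_dz = math.exp(pair_logpdf(x1, x2, var, cov_dz))
        total += math.log(p_mz * lik_mz + (1 - p_mz) * lik_dz)
    return total
```

Under these assumptions, `naive_estimates(0.6, 0.2, 0.2, 0.10)` works out to roughly 0.48, 0.29 and 0.23, which agrees with the biased values quoted in the abstract. Maximizing `mixture_loglik` over `a`, `c` and `e` with a numerical optimizer would be one way to obtain mixture-based estimates. Setting `p_mz = 0.5` for every pair corresponds to having no zygosity information at all.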