Greenberg D A, Hodge S E
Am J Med Genet. 1985 Jun;21(2):357-71. doi: 10.1002/ajmg.1320210219.
The purposes of this work were 1) to reparameterize the likelihood used in segregation analysis in a way particularly suited to detecting heterogeneity, so that the analysis yields a parameter giving the proportion of families in the dataset with the genetic form of the disease, and 2) to test by simulation how well this reparameterization works. We assume that a dataset contains nuclear family data, in which some families have an environmentally caused form of the disease and the others have a genetic form. In this study, we considered the case where the genetic form is a simple recessive and the environmental form follows a random model. The underlying parameters were the gene frequency, q, and the frequency of sporadics, R. We reparameterized the likelihood in terms of alpha, the percentage of genetic families in the dataset, which we attempt to estimate. We contrast the estimates of alpha with the population heterogeneity as reflected in the estimates of q and R. For the simulation, nuclear families were generated: genetic families were simulated with a mendelian recessive pattern and environmental families according to a simple random model. Over a wide range of generating parameters, estimates of alpha were good, differing from the "true" values by only a few percent. Estimates of q and R, on the other hand, ranged from fair to poor. Our results indicate that the amount of heterogeneity in a dataset can be accurately estimated using segregation analysis, even when estimates of the gene frequency and penetrance among sporadics are unreliable.
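The idea of estimating a mixing proportion alpha can be illustrated with a much simpler model than the one the abstract describes. The Python sketch below is not the authors' likelihood: it does not use q or R. It makes the following illustrative assumptions, none of which come from the abstract: every family has the same sibship size `s`; genetic families are carrier-by-carrier matings with a fully penetrant recessive, so each child is affected with probability `p_gen = 0.25`; environmental families have an independent per-child risk `p_env` that is treated as known; and families enter the dataset only if at least one child is affected. All function and parameter names are hypothetical. Under these assumptions the likelihood is a two-component mixture of truncated binomials, and alpha is estimated by a grid search.

```python
import math
import random


def truncated_binom_pmf(r, s, p):
    """P(r affected among s children | at least one affected), binomial model."""
    if r < 1 or r > s:
        return 0.0
    return math.comb(s, r) * p**r * (1 - p) ** (s - r) / (1 - (1 - p) ** s)


def mixture_loglik(alpha, counts, s, p_gen, p_env):
    """Log-likelihood of alpha, the proportion of genetic families in the dataset.

    counts maps r (number of affected children) to the number of families.
    """
    ll = 0.0
    for r, n in counts.items():
        lik = (alpha * truncated_binom_pmf(r, s, p_gen)
               + (1 - alpha) * truncated_binom_pmf(r, s, p_env))
        ll += n * math.log(lik)
    return ll


def simulate_counts(n_families, s, alpha, p_gen, p_env, rng):
    """Simulate ascertained families; each is genetic with probability alpha."""
    counts = {}
    for _ in range(n_families):
        p = p_gen if rng.random() < alpha else p_env
        # Rejection sampling: keep only sibships with at least one affected child.
        while True:
            r = sum(rng.random() < p for _ in range(s))
            if r >= 1:
                break
        counts[r] = counts.get(r, 0) + 1
    return counts


def estimate_alpha(counts, s, p_gen, p_env, grid=1001):
    """Maximum-likelihood estimate of alpha by grid search over [0, 1]."""
    candidates = (i / (grid - 1) for i in range(grid))
    return max(candidates,
               key=lambda a: mixture_loglik(a, counts, s, p_gen, p_env))


if __name__ == "__main__":
    rng = random.Random(1985)  # fixed seed so the run is repeatable
    true_alpha = 0.6           # hypothetical generating value
    counts = simulate_counts(5000, s=4, alpha=true_alpha,
                             p_gen=0.25, p_env=0.05, rng=rng)
    print("true alpha:", true_alpha)
    print("estimated alpha:", estimate_alpha(counts, s=4, p_gen=0.25, p_env=0.05))
```

Because `p_env` is fixed here rather than estimated, this sketch says nothing about how well q and R themselves can be recovered; it only shows how a mixing proportion enters the likelihood and can be estimated from the distribution of affected children per family.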