Efficiency of haplotype frequency estimation when nuclear family information is included.

Authors

Becker Tim, Knapp Michael

Affiliation

Institute for Medical Biometry, Informatics and Epidemiology, University of Bonn, Bonn, Germany.

Publication information

Hum Hered. 2002;54(1):45-53. doi: 10.1159/000066692.

Abstract

In genetic studies, the haplotype structure of the population under study is expected to carry important information. Experimental methods for deriving haplotypes, however, are expensive, and none of them has yet become standard methodology. On the other hand, maximum likelihood haplotype estimation from unphased individual genotypes may incur inaccuracies. We therefore investigated the relative efficiency of haplotype frequency estimation when nuclear family information is included, compared to estimation from experimentally derived haplotypes. Efficiency was measured in terms of variance ratios of the estimates. The variances were derived from the binomial distribution for experimentally derived haplotypes, and from the Fisher information matrix corresponding to the general likelihood function of the haplotype frequency parameters when family information is included. We subsequently compared these variance ratios to the variance ratios for estimation from individual genotypes. We found that the information gained from a single child compensates for missing phase information to a high degree, resulting in estimates almost as reliable as those derived from observed haplotypes. Thus, if children have already been genotyped for other reasons, it is highly advisable to include them in the estimation. If child information is not already available, whether it is worthwhile to genotype a single child just to reduce phase ambiguity depends on the number of loci and the haplotype diversity. In general, if the number of loci is at most three, or if the number of haplotypes with a frequency >5% is at most four, haplotype estimation from individuals is already quite good, and the improvement gained from a single child cannot compensate for the effort of genotyping that child. On the other hand, under scenarios with many loci and high haplotype diversity, haplotype frequency estimation from trios can be more efficient than haplotype frequency estimation from individuals, even on a per-genotype basis.
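
The variance-ratio notion of efficiency described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the simplest case of unrelated individuals (no family information), two biallelic loci, and Hardy-Weinberg equilibrium, all of which are assumptions made here for illustration. The function names (`unphased`, `phased`, `variance_first_haplotype`, `efficiency`) are hypothetical. The sketch compares the binomial variance for observed haplotypes with the variance obtained from the inverse Fisher information matrix for unphased genotypes.

```python
from itertools import product

# Haplotypes at two biallelic loci, coded as (allele at locus 1, allele at locus 2).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def unphased(i, j):
    """Class observed without phase: allele count at each locus."""
    a, b = HAPLOTYPES[i], HAPLOTYPES[j]
    return (a[0] + b[0], a[1] + b[1])


def phased(i, j):
    """Class observed with phase: the unordered pair of haplotypes."""
    return (min(i, j), max(i, j))


def variance_first_haplotype(freqs, n, classify):
    """Asymptotic variance of the ML estimate of freqs[0] from n individuals.

    Assumes Hardy-Weinberg equilibrium. The free parameters are
    freqs[0], freqs[1], freqs[2]; freqs[3] is 1 minus their sum.
    """
    prob = {}  # observed class -> probability
    grad = {}  # observed class -> gradient w.r.t. the three free parameters
    for i, j in product(range(4), repeat=2):  # ordered haplotype pairs
        cls = classify(i, j)
        prob[cls] = prob.get(cls, 0.0) + freqs[i] * freqs[j]
        d = grad.setdefault(cls, [0.0, 0.0, 0.0])
        for k in range(3):
            # Derivative of freqs[i] * freqs[j] w.r.t. parameter k,
            # accounting for freqs[3] depending on the other three.
            di = (i == k) - (i == 3)
            dj = (j == k) - (j == 3)
            d[k] += di * freqs[j] + freqs[i] * dj
    # Fisher information matrix of a multinomial sample of size n.
    info = [[n * sum(grad[c][r] * grad[c][s] / prob[c] for c in prob)
             for s in range(3)] for r in range(3)]
    (a, b, c), (d, e, f), (g, h, k) = info
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    return (e * k - f * h) / det  # entry [0][0] of the inverse matrix


def efficiency(freqs, n=100):
    """Variance ratio: observed haplotypes versus unphased genotypes."""
    binomial = freqs[0] * (1 - freqs[0]) / (2 * n)  # 2n observed haplotypes
    return binomial / variance_first_haplotype(freqs, n, unphased)


if __name__ == "__main__":
    print(efficiency([0.4, 0.3, 0.2, 0.1]))
```

With two biallelic loci only the double heterozygote is phase-ambiguous, so the ratio stays below 1 but does not depend on the sample size `n`. Extending the sketch to trios would require enumerating parental haplotype pairs together with the transmitted haplotypes, which is not attempted here.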

