Relative efficiency of ambiguous vs. directly measured haplotype frequencies.

Author information

Schaid Daniel J

Affiliation

Departments of Health Sciences Research and Medical Genetics, Mayo Clinic/Foundation, Rochester, Minnesota 55905, USA.

Publication information

Genet Epidemiol. 2002 Nov;23(4):426-43. doi: 10.1002/gepi.10184.

Abstract

Haplotypes are useful for both fine-mapping of susceptibility loci and evaluation of sequence variation at multiple sites along a chromosome. However, they are difficult to measure directly over long stretches of DNA in diploid organisms. Consequently, multiple genetic markers are typically measured without linkage phase information, giving rise to a subject's diplotype. From diplotype data, haplotypes are often inferred from pedigree information, or treated as partially missing data when haplotype frequencies are estimated among unrelated subjects. This latter ambiguity can increase the variance of the estimated haplotype frequencies. Douglas et al. ([2001] Nat. Genet. 28:361-364) recently quantified the efficiency of estimating haplotype frequencies from the diplotypes of unrelated subjects, relative to directly measured haplotypes via somatic cell hybrids (conversion technology), and demonstrated that unknown linkage phase can lead to a large loss of efficiency. However, their results were based on linkage equilibrium among marker loci, which may not be realistic for closely linked markers. We extend their relative efficiency calculations in several respects: 1) allowance for linkage disequilibrium (LD) among marker loci; 2) evaluation of different patterns of LD; and 3) evaluation of nuclear families with and without parents. We show that although the loss in efficiency of haplotype frequency estimates among unrelated subjects decreases as LD increases to its maximum value, the general conclusions of Douglas et al. ([2001] Nat. Genet. 28:361-364) hold true for a variety of LD patterns and magnitudes. However, our results also demonstrate that trios of parents plus one child are highly efficient for haplotype frequency estimation, that additional children offer little information, and that siblings without parents can be grossly inefficient. Genet. Epidemiol. 23:426-443, 2002.
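
As an illustration of what "treated as partially missing data" can mean for unrelated subjects, the sketch below estimates two-locus haplotype frequencies from unphased genotypes with an expectation-maximization (EM) iteration. This is a common textbook approach and is only an assumption here; the abstract does not state which estimation procedure the paper uses. The function name `em_two_locus`, the 0/1/2 genotype coding, and the haplotype ordering are hypothetical choices made for this example. With two biallelic markers, only the double heterozygote has unknown phase, so the E-step reduces to splitting those subjects between the two possible phase configurations.

```python
def em_two_locus(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies for two biallelic loci via EM.

    genotypes: list of (g1, g2), each the count (0, 1, 2) of allele '1'
    at locus 1 and locus 2 for one unrelated subject (assumed coding).
    Returns frequencies in the order [00, 01, 10, 11].
    """
    base = [0.0] * 4   # haplotype counts from phase-known subjects
    n_dh = 0           # number of double heterozygotes (phase unknown)
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_dh += 1
            continue
        # At least one locus is homozygous, so phase is determined.
        a = [0, 0] if g1 == 0 else [1, 1] if g1 == 2 else [0, 1]
        b = [0, 0] if g2 == 0 else [1, 1] if g2 == 2 else [0, 1]
        for x, y in zip(a, b):
            base[2 * x + y] += 1

    total = 2 * len(genotypes)   # two haplotypes per subject
    p = [0.25] * 4               # starting values
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11
        # rather than 01/10, given current frequencies.
        cis = p[0] * p[3]
        trans = p[1] * p[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected counts.
        counts = [
            base[0] + n_dh * w,
            base[1] + n_dh * (1 - w),
            base[2] + n_dh * (1 - w),
            base[3] + n_dh * w,
        ]
        new_p = [c / total for c in counts]
        converged = max(abs(x - y) for x, y in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


if __name__ == "__main__":
    sample = [(0, 0), (2, 2), (1, 1), (0, 1), (1, 2)]
    print(em_two_locus(sample))
```

The phase-ambiguous subjects contribute only expected counts rather than observed ones, which is the source of the extra variance the abstract describes; directly measured haplotypes or informative relatives would replace those expected counts with known ones.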
