Haplotype frequency estimation in patient populations: the effect of departures from Hardy-Weinberg proportions and collapsing over a locus in the HLA region.

Authors

Single Richard M, Meyer Diogo, Hollenbach Jill A, Nelson Mark P, Noble Janelle A, Erlich Henry A, Thomson Glenys

Affiliation

Department of Integrative Biology, University of California, Berkeley, USA.

Publication

Genet Epidemiol. 2002 Feb;22(2):186-95. doi: 10.1002/gepi.0163.

Abstract

Haplotype analyses are an important area in the study of the genetic components of human disease. Associations between markers and disease loci that are not evident with a single marker locus may be identified in multi-locus marker analyses using estimated haplotype frequencies (HFs). Procedures that make use of the expectation-maximization (EM) algorithm to estimate HFs from unphased genotype data are in common use in genetic studies. The EM algorithm uses these unphased genotype frequencies along with the assumption of Hardy-Weinberg proportions (HWP) to converge on HF estimates. In this paper, we assess the accuracy of EM estimates of HFs in patients with type I diabetes for whom the true haplotypes are known, but the data are analyzed ignoring family information to allow comparison between estimated and true frequencies. The data consist of six HLA loci with high levels of polymorphism and a range of departures from HWP and linkage equilibrium. While the overall accuracy of the EM estimates is good, there can be large over- and underestimates of particular HFs, even for common haplotypes, especially when the loci involved deviate significantly from HWP. Estimating HFs for three or more loci and then collapsing over loci so as to generate two-locus haplotypes can improve the accuracy of the estimation. The collapsing procedure is most beneficial when one of the loci in the two-locus haplotype of interest deviates significantly from HWP and the locus collapsed over is in linkage disequilibrium with the other loci.
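
As a rough illustration of the two steps the abstract describes (EM estimation of HFs from unphased genotypes under the HWP assumption, followed by collapsing over a locus), here is a minimal Python sketch. It is not the authors' software: the function names (compatible_pairs, em_haplotype_frequencies, collapse_over_loci), the genotype encoding, and the convergence settings are all assumptions made for this example, and it ignores missing data and the large haplotype spaces that real six-locus HLA data would produce.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with one unphased genotype.

    genotype: tuple of (allele_a, allele_b) pairs, one pair per locus
    (hypothetical encoding chosen for this sketch).
    """
    pairs = set()
    # At each locus, choose which of the two alleles sits on the first haplotype.
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(locus[c] for locus, c in zip(genotype, choice))
        h2 = tuple(locus[1 - c] for locus, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg proportions."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in expansions for pair in pairs for h in pair})
    # Start from equal frequencies for every haplotype that could be present.
    freqs = {h: 1.0 / len(haplotypes) for h in haplotypes}
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in expansions:
            # E-step: probability of each phase resolution under HWP
            # (p1 * p2, doubled when the two haplotypes differ).
            weights = [
                freqs[h1] * freqs[h2] * (1.0 if h1 == h2 else 2.0)
                for h1, h2 in pairs
            ]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        new_freqs = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haplotypes)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


def collapse_over_loci(freqs, keep):
    """Sum multi-locus haplotype frequencies over every locus not listed in keep."""
    collapsed = defaultdict(float)
    for haplotype, f in freqs.items():
        collapsed[tuple(haplotype[i] for i in keep)] += f
    return dict(collapsed)
```

In this sketch, a two-locus estimate obtained by collapsing would come from running em_haplotype_frequencies on three-locus genotypes and then calling collapse_over_loci with keep=(0, 2) to sum over the middle locus. Comparing that result with a direct two-locus EM run on the same individuals is the kind of comparison the abstract reports, though the accuracy findings themselves depend on the authors' data and cannot be reproduced from this toy code.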
