Modeling Informatively Missing Genotypes in Haplotype Analysis.

Authors

Liu Nianjun, Bucala Richard, Zhao Hongyu

Affiliation

Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL.

Publication

Commun Stat Theory Methods. 2009;38(18):3445-3460. doi: 10.1080/03610920802696588.

Abstract

It is common to have missing genotypes in practical genetic studies. The majority of the existing statistical methods, including those on haplotype analysis, assume that genotypes are missing at random; that is, at a given marker, different genotypes and different alleles are missing with the same probability. In our previous work, we have demonstrated that the violation of this assumption may lead to serious bias in haplotype frequency estimates and haplotype association analysis. We have proposed a general missing data model to simultaneously characterize missing data patterns across a set of two or more biallelic markers. We have proved that, under the general missing data model, haplotype frequencies and missing data probabilities are identifiable if and only if there is linkage disequilibrium between these markers. In this study, we extend our work to multi-allelic markers and observe a similar finding. Simulation studies on the analysis of haplotypes consisting of two markers illustrate that our proposed model can reduce the bias in haplotype frequency estimates due to incorrect assumptions about the missing data mechanism. Finally, we illustrate the utility of our method through its application to a real data set from a study of scleroderma.
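The bias described in the abstract can be illustrated with a small calculation. The sketch below is not the authors' missing data model: it is a deliberately simplified, single-marker example that assumes Hardy-Weinberg proportions and computes the large-sample allele frequency obtained when only individuals with observed genotypes are used. The function names, the allele frequency of 0.3, and the per-genotype missing probabilities are all hypothetical values chosen for illustration.

```python
# Illustrative only: a single biallelic marker, not the paper's multi-marker model.
# All numeric values below are hypothetical.

def hwe_genotype_freqs(p):
    """Hardy-Weinberg genotype frequencies for allele A with frequency p."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def complete_case_allele_freq(p, miss_prob):
    """Large-sample frequency of allele A among individuals with observed genotypes.

    miss_prob maps each genotype to its assumed probability of being missing.
    """
    freqs = hwe_genotype_freqs(p)
    # Expected proportion of the sample that is observed with each genotype.
    observed = {g: f * (1.0 - miss_prob[g]) for g, f in freqs.items()}
    total = sum(observed.values())
    a_copies = 2 * observed["AA"] + observed["Aa"]
    return a_copies / (2 * total)


TRUE_P = 0.3  # hypothetical true frequency of allele A

# Every genotype is missing with the same probability: no large-sample bias.
equal_missing = complete_case_allele_freq(
    TRUE_P, {"AA": 0.1, "Aa": 0.1, "aa": 0.1}
)

# Missingness depends on the genotype (informative missingness).
informative = complete_case_allele_freq(
    TRUE_P, {"AA": 0.4, "Aa": 0.2, "aa": 0.05}
)

print(f"true frequency of A:            {TRUE_P:.4f}")
print(f"equal missing probabilities:    {equal_missing:.4f}")
print(f"genotype-dependent missingness: {informative:.4f}")
```

With these hypothetical rates, the estimate under equal missing probabilities recovers 0.3, while the genotype-dependent case yields roughly 0.26, because genotypes carrying allele A are lost more often. The haplotype setting in the paper is more involved, since it spans several markers and phase is not directly observed, but the source of the bias is the same.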
