Haplotype inference for population data with genotyping errors.

Author information

Zhu Wensheng, Kuk Anthony Y C, Guo Jianhua

Affiliation

Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun, P. R. China.

Publication information

Biom J. 2009 Aug;51(4):644-58. doi: 10.1002/bimj.200800215.

PMID: 19688759

Abstract

Haplotype inference is important in genetic epidemiology studies. However, all large genotype data sets contain errors, caused by inexpensive and fallible genotyping machines and by shortcomings in genotype-scoring software, and these errors can have an enormous impact on haplotype inference. In this article, we propose two novel strategies to reduce the impact of genotyping errors on haplotype inference. The first method makes use of double sampling: for each individual, a "GenoSpectrum", consisting of all possible genotypes and their corresponding likelihoods, is computed. The second method is a genotype clustering algorithm based on multi-genotyping data, which also assigns a "GenoSpectrum" to each individual. We then describe two hybrid EM algorithms (called DS-EM and MG-EM) that perform haplotype inference based on the "GenoSpectrum" of each individual obtained from double sampling and multi-genotyping data, respectively. Results on both simulated data sets and a quasi-real data set show that the proposed methods perform well in different situations and, when the genotype data sets contain errors, outperform the conventional EM algorithm and the HMM algorithm proposed by Sun, Greenwood, and Neal (2007, Genetic Epidemiology 31, 937-948).
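
The general idea of running EM over a per-individual spectrum of candidate genotypes can be illustrated with a small sketch. This is not the authors' DS-EM or MG-EM algorithm, whose details the abstract does not give. It is a generic EM for haplotype frequencies under two stated assumptions: loci are biallelic with Hardy-Weinberg equilibrium, and each individual's spectrum is supplied as a dictionary mapping a genotype (a tuple of allele counts 0, 1, or 2 per locus) to a likelihood weight. The names `compatible_pairs` and `em_haplotype_freqs` are hypothetical.

```python
from itertools import product


def compatible_pairs(genotype):
    """Return all ordered haplotype pairs (h1, h2) whose allele counts sum to genotype."""
    per_locus = []
    for g in genotype:
        if g == 0:
            per_locus.append([(0, 0)])
        elif g == 2:
            per_locus.append([(1, 1)])
        else:  # heterozygous locus: phase is unknown, so keep both orderings
            per_locus.append([(0, 1), (1, 0)])
    pairs = []
    for combo in product(*per_locus):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.append((h1, h2))
    return pairs


def em_haplotype_freqs(spectra, n_loci, n_iter=500, tol=1e-10):
    """Estimate haplotype frequencies by EM from per-individual genotype spectra.

    spectra: list of dicts {genotype tuple: likelihood weight} (assumed input format).
    """
    haps = list(product((0, 1), repeat=n_loci))
    freqs = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        # E-step: posterior weight of every (genotype, haplotype pair) per individual
        for spectrum in spectra:
            terms = []
            for genotype, weight in spectrum.items():
                for h1, h2 in compatible_pairs(genotype):
                    terms.append((h1, h2, weight * freqs[h1] * freqs[h2]))
            total = sum(t for _, _, t in terms)
            if total == 0.0:
                raise ValueError("spectrum has zero likelihood under current frequencies")
            for h1, h2, t in terms:
                counts[h1] += t / total
                counts[h2] += t / total
        # M-step: each individual contributes two haplotypes
        n_haps = 2.0 * len(spectra)
        new_freqs = {h: counts[h] / n_haps for h in haps}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs
```

With a spectrum that puts all weight on a single genotype, this reduces to the conventional EM algorithm on observed genotypes. Spreading the weight over several genotypes lets an uncertain call contribute to every haplotype pair it might correspond to, in proportion to its likelihood.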

