Bayesian EM algorithm for scoring polymorphic deletions from SNP data and application to a common CNV on 8q24.

Author information

Zöllner Sebastian, Su Gang, Stewart William C L, Chen Yi, McInnis Melvin G, Burmeister Margit

Affiliation

Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109-2029, USA.

Publication information

Genet Epidemiol. 2009 May;33(4):357-68. doi: 10.1002/gepi.20391.

Abstract

Copy number variations (CNVs) in the human genome provide exciting candidates for functional polymorphisms. Hence, we now assess association between CNV carrier status and disease status by evaluating the signal intensity of SNP genotyping assays. Here, we present a novel statistical method designed to perform such inference and apply this method to a known CNV in a bipolar disorder linkage region. Using Bayesian computations, we calculate the posterior probability of CNV carrier status for each individual in a sample by jointly analyzing genotype information and hybridization intensity. We model the signal intensity as a mixture of normal distributions, allowing for locus-specific and allele-specific distributions. Using an expectation maximization (EM) algorithm, we estimate the parameters of these distributions and use these estimates to infer the carrier status of each individual and the boundaries of the CNV. We applied the method to a previously described common deletion on 8q24, a region consistently showing linkage to bipolar disorder, in a sample of 3,512 individuals and unambiguously inferred 172 heterozygous and 1 homozygous deletion carriers. We observed no significant association between bipolar disorder and carrier status. We carefully assessed the validity of the inferred carrier status and observed no indication of errors. Furthermore, the algorithm precisely identifies the boundaries of the CNV. Finally, we assessed the power of this algorithm to detect shorter CNVs by sub-sampling from the SNPs covered by this deletion, demonstrating that our EM algorithm produces precise estimates of carrier status.
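
The idea of scoring carrier status from a mixture of normal distributions can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes a single summary intensity per individual (for example, an average over the SNPs in the region), only two classes (deletion carrier and non-carrier), and no genotype information, whereas the abstract describes locus-specific and allele-specific distributions analyzed jointly with genotypes. The function names `normal_pdf`, `carrier_posterior`, and `fit_two_component_em`, the min/max initialization, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def carrier_posterior(x, weights, means, sds):
    """Posterior class probabilities per individual (Bayes' rule).

    Column 0 is the low-intensity class (assumed deletion carriers),
    column 1 the high-intensity class (assumed non-carriers).
    Also returns the per-individual mixture density.
    """
    dens = np.stack(
        [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)], axis=1
    )
    total = np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
    return dens / total, total


def fit_two_component_em(intensity, n_iter=500, tol=1e-9):
    """Fit a two-component 1-D normal mixture by EM (illustrative only)."""
    x = np.asarray(intensity, dtype=float)
    # Crude initialization (an assumption): one component at each extreme.
    means = np.array([x.min(), x.max()])
    sds = np.full(2, x.std() + 1e-6)
    weights = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each individual.
        resp, total = carrier_posterior(x, weights, means, sds)
        # M-step: re-estimate mixing weights, means, and standard deviations.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        sds = np.sqrt((resp * (x[:, None] - means) ** 2).sum(axis=0) / nk) + 1e-6
        # Stop when the log-likelihood (from the E-step) no longer changes.
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, sds


# Simulated example: 50 carriers with reduced intensity, 950 non-carriers.
rng = np.random.default_rng(0)
is_carrier = np.concatenate([np.ones(50, dtype=bool), np.zeros(950, dtype=bool)])
intensity = np.where(
    is_carrier, rng.normal(-1.0, 0.2, 1000), rng.normal(0.0, 0.2, 1000)
)

weights, means, sds = fit_two_component_em(intensity)
posterior, _ = carrier_posterior(intensity, weights, means, sds)
called_carrier = posterior[:, 0] > 0.5
print("estimated carrier fraction:", weights[0])
print("individuals called as carriers:", int(called_carrier.sum()))
```

In this sketch the E-step is the Bayesian part: each individual's posterior probability of belonging to the low-intensity class is the weighted class density divided by the total mixture density. A fuller model along the lines of the abstract would fit separate distributions per SNP and per allele and combine evidence across SNPs.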
