A method for calling copy number polymorphism using haplotypes.

Author information

Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

Front Genet. 2013 Sep 23;4:165. doi: 10.3389/fgene.2013.00165. eCollection 2013.

Abstract

Single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) are both widespread characteristics of the human genome, but they are often called separately on common genotyping platforms. To capture integrated SNP and CNV information, methods have been developed for calling allele-specific copy numbers, or so-called copy number polymorphisms (CNPs), using limited inter-marker correlation. In this paper, we propose a haplotype-based maximum likelihood method to call CNPs, which takes advantage of the valuable multi-locus linkage disequilibrium (LD) information in the population. We also developed a computationally efficient algorithm that iteratively estimates haplotype frequencies and optimizes individual CNP calls, even in the presence of missing data. Through simulations, we demonstrate that our model is more sensitive and accurate in detecting various CNV regions than commonly used CNV calling methods, including PennCNV, another hidden Markov model (HMM) using CNP, a scan statistic, segCNV, and cnvHap. Our method often performs better in regions with higher LD, in longer CNV regions, and for common CNVs than in regions with lower LD, in shorter CNV regions, and for rare CNVs. We applied our method to the genotypes of 90 HapMap CEU samples and 23 patients with acute lung injury (ALI). Each ALI patient was genotyped twice. The CNP calls from our method show good consistency, and their accuracy is comparable to that of other methods.
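To make the idea of iteratively estimating haplotype frequencies concrete, the following is a minimal sketch of the classical expectation-maximization (EM) approach for unphased biallelic SNP genotypes. It is not the authors' algorithm: the method described in the abstract also handles allele-specific copy numbers, intensity data, and missing data, none of which are modeled here. The function names (`compatible_pairs`, `em_haplotype_freqs`) and the 0/1/2 genotype coding are assumptions made for illustration.

```python
from itertools import product


def compatible_pairs(genotype):
    """Return all unordered haplotype pairs whose allele counts sum to the genotype.

    A genotype is a tuple of 0/1/2 allele counts, one entry per SNP
    (hypothetical coding chosen for this sketch).
    """
    haps = list(product((0, 1), repeat=len(genotype)))
    pairs = []
    for i, h1 in enumerate(haps):
        for h2 in haps[i:]:
            if all(a + b == g for a, b, g in zip(h1, h2, genotype)):
                pairs.append((h1, h2))
    return pairs


def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-10):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    n_loci = len(genotypes[0])
    haps = list(product((0, 1), repeat=n_loci))
    # Start from uniform frequencies over all 2^L possible haplotypes.
    freqs = {h: 1.0 / len(haps) for h in haps}
    pair_cache = {g: compatible_pairs(g) for g in set(genotypes)}

    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        for g in genotypes:
            pairs = pair_cache[g]
            # E-step: posterior weight of each compatible pair under
            # Hardy-Weinberg proportions (factor 2 for distinct haplotypes).
            weights = [(1 if h1 == h2 else 2) * freqs[h1] * freqs[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by 2N chromosomes.
        new_freqs = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Toy data: four unambiguous homozygotes and one double heterozygote.
toy = [(0, 0), (0, 0), (2, 2), (2, 2), (1, 1)]
estimated = em_haplotype_freqs(toy)
```

In this toy data set the double heterozygote `(1, 1)` is ambiguous on its own, but the homozygous individuals indicate that haplotypes `(0, 0)` and `(1, 1)` are common, so the EM iterations resolve it in their favor. This is the sense in which population-level LD information can sharpen individual calls.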

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/945e/3780619/7b1a539816eb/fgene-04-00165-g0001.jpg
