PRIMAL: Fast and accurate pedigree-based imputation from sequence data in a founder population.

Authors

Livne Oren E, Han Lide, Alkorta-Aranburu Gorka, Wentworth-Sheilds William, Abney Mark, Ober Carole, Nicolae Dan L

Affiliations

Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America.

Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America; Departments of Medicine, and Statistics, The University of Chicago, Chicago, Illinois, United States of America.

Publication information

PLoS Comput Biol. 2015 Mar 3;11(3):e1004139. doi: 10.1371/journal.pcbi.1004139. eCollection 2015 Mar.

Abstract

Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.
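
The abstract does not spell out how IBD cliques are used for imputation, so the following is only a minimal sketch of the general idea, not PRIMAL's actual implementation. It assumes that, at a single variant site, haplotypes have already been grouped into sets that are mutually identical by descent (the "cliques"), and that an allele observed on any sequenced haplotype in a set can be copied to the other members. The names `Haplotype`, `impute_from_cliques`, and the (individual, parental origin) encoding are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding: (individual id, 0 = paternal / 1 = maternal haplotype).
Haplotype = Tuple[str, int]


def impute_from_cliques(
    cliques: List[Set[Haplotype]],
    sequenced: Dict[Haplotype, str],
) -> Dict[Haplotype, Optional[str]]:
    """Copy observed alleles to all haplotypes in the same IBD clique.

    cliques   -- sets of haplotypes assumed mutually IBD at one site
    sequenced -- alleles observed on haplotypes of sequenced individuals
    Returns an allele per haplotype, or None where nothing can be inferred.
    """
    imputed: Dict[Haplotype, Optional[str]] = {}
    for clique in cliques:
        # Distinct alleles observed among sequenced members of this clique.
        observed = {sequenced[h] for h in clique if h in sequenced}
        # Propagate only when the sequenced members agree on one allele.
        allele = next(iter(observed)) if len(observed) == 1 else None
        for h in clique:
            # A sequenced haplotype always keeps its own observed allele.
            imputed[h] = sequenced.get(h, allele)
    return imputed


# Toy example: individual A is sequenced; B, C and D are not.
cliques = [
    {("A", 0), ("B", 1), ("C", 0)},  # share A's paternal haplotype
    {("A", 1), ("D", 0)},            # share A's maternal haplotype
    {("B", 0), ("C", 1)},            # no sequenced member
]
sequenced = {("A", 0): "T", ("A", 1): "C"}
result = impute_from_cliques(cliques, sequenced)
```

In this toy example, B and C each receive one allele from A's paternal haplotype while their other haplotype stays unassigned. Such partially resolved genotypes are the kind of gap that the LD-based imputation mentioned in the abstract would be expected to help fill. Because each allele is assigned to a specific parental haplotype, the same bookkeeping also suggests how parental origin could be tracked.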
