Imputation of missing genotypes within LD-blocks relying on the basic coalescent and beyond: consideration of population growth and structure.

Author affiliations

Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany.

Institute of Medical Biometry and Informatics, University of Heidelberg, 69120, Heidelberg, Germany.

Publication information

BMC Genomics. 2017 Oct 17;18(1):798. doi: 10.1186/s12864-017-4208-2.

Abstract

BACKGROUND

Genotypes not directly measured in genetic studies are often imputed to improve statistical power and to increase mapping resolution. The accuracy of standard imputation techniques depends strongly on the similarity of linkage disequilibrium (LD) patterns in the study and reference populations. Here we develop a novel approach for genotype imputation in low-recombination regions that relies on the coalescent and makes it possible to account explicitly for population demographic factors. To test the new method, study and reference haplotypes were simulated, and gene trees were inferred under the basic coalescent and also under models that consider population growth and structure. The reference haplotypes that first coalesced with study haplotypes were used as templates for genotype imputation. Computer simulations were complemented with the analysis of real data. Genotype concordance rates were used to compare the accuracies of coalescent-based and standard (IMPUTE2) imputation.
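The template idea and the concordance metric described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: instead of inferring a gene tree and finding the reference haplotype that first coalesces with the study haplotype, it uses the smallest Hamming distance at observed sites as a crude stand-in. All function names, the toy haplotypes, and the masking scheme are hypothetical.

```python
from typing import List, Optional, Sequence

# A haplotype is a sequence of alleles coded 0/1; None marks a missing allele.
Haplotype = Sequence[Optional[int]]


def mismatches_at_observed(study: Haplotype, ref: Sequence[int]) -> int:
    """Count allele mismatches at sites where the study allele is observed."""
    return sum(1 for s, r in zip(study, ref) if s is not None and s != r)


def impute_from_template(study: Haplotype,
                         references: List[Sequence[int]]) -> List[int]:
    """Fill missing alleles from the closest reference haplotype.

    ASSUMPTION: Hamming distance replaces coalescent-based tree inference
    here; the paper selects the template by first coalescence instead.
    """
    template = min(references, key=lambda ref: mismatches_at_observed(study, ref))
    return [r if s is None else s for s, r in zip(study, template)]


def genotype_concordance(imputed: Sequence[int], truth: Sequence[int],
                         masked_sites: Sequence[int]) -> float:
    """Fraction of masked sites whose imputed genotype equals the true one."""
    hits = sum(1 for i in masked_sites if imputed[i] == truth[i])
    return hits / len(masked_sites)


# Hypothetical toy data: three reference haplotypes over four sites.
references = [[0, 0, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0]]

# One study individual with two haplotypes; sites 1 and 3 are untyped.
study_a = [1, None, 1, None]
study_b = [0, None, 1, None]

imputed_a = impute_from_template(study_a, references)
imputed_b = impute_from_template(study_b, references)

# A genotype is the per-site sum of the two haplotype alleles (0, 1, or 2).
imputed_genotype = [a + b for a, b in zip(imputed_a, imputed_b)]
true_genotype = [1, 0, 2, 0]  # hypothetical ground truth

concordance = genotype_concordance(imputed_genotype, true_genotype, [1, 3])
print(imputed_genotype, concordance)
```

In a simulation study of the kind described, the true genotypes at masked sites are known, so the same concordance function could be applied to the output of any imputation method for a like-for-like comparison.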

RESULTS

Simulations revealed that, in LD-blocks, imputation accuracy relying on the basic coalescent was higher and less variable than with IMPUTE2. Explicitly accounting for population growth and structure, even when they were present, yielded practically no improvement in accuracy. The advantage of coalescent-based over standard imputation increased with the minor allele frequency and decreased with population stratification. Results based on real data indicated that, even in low-recombination regions, further research is needed to incorporate recombination in coalescence inference, in particular for studies with genetically diverse and admixed individuals.

CONCLUSIONS

To exploit the full potential of coalescent-based methods for the imputation of missing genotypes in genetic studies, further methodological research is needed to reduce computation time, to take recombination into account, and to implement these methods in user-friendly computer programs. Here we provide reproducible code that takes advantage of publicly available software to facilitate further developments in the field.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f78/5646149/17bed42ae289/12864_2017_4208_Fig1_HTML.jpg
