Imputing Genotypes in Biallelic Populations from Low-Coverage Sequence Data.

Authors

Fragoso Christopher A, Heffelfinger Christopher, Zhao Hongyu, Dellaporta Stephen L

Affiliations

Program of Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520; Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut 06520.

Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut 06520.

Publication

Genetics. 2016 Feb;202(2):487-95. doi: 10.1534/genetics.115.182071. Epub 2015 Dec 29.

Abstract

Low-coverage next-generation sequencing methodologies are routinely employed to genotype large populations. Missing data in these populations manifest both as missing markers and markers with incomplete allele recovery. False homozygous calls at heterozygous sites resulting from incomplete allele recovery confound many existing imputation algorithms. These types of systematic errors can be minimized by incorporating depth-of-sequencing read coverage into the imputation algorithm. Accordingly, we developed Low-Coverage Biallelic Impute (LB-Impute) to resolve missing data issues. LB-Impute uses a hidden Markov model that incorporates marker read coverage to determine variable emission probabilities. Robust, highly accurate imputation results were reliably obtained with LB-Impute, even at extremely low (<1×) average per-marker coverage. This finding will have implications for the design of genotype imputation algorithms in the future. LB-Impute is publicly available on GitHub at https://github.com/dellaporta-laboratory/LB-Impute.
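
The idea of coverage-dependent emission probabilities can be illustrated with a minimal sketch. This is not LB-Impute's implementation: the three-state model, the binomial emission, the names `STATES`, `emission`, and `viterbi`, and the parameters `err` and `switch` are all assumptions chosen for illustration. It shows why a single read at a heterozygous site, which always looks homozygous, is only weak evidence, while deeper coverage sharpens the emission.

```python
import math

# Hypothetical genotype states for a biallelic population with parents A and B.
STATES = ("AA", "AB", "BB")


def emission(a_reads, b_reads, state, err=0.01):
    """P(read counts | genotype): binomial in the expected A-allele fraction.

    With zero reads the value is 1.0 for every state, so a missing marker
    carries no information and is filled in from its neighbours.
    """
    p_a = {"AA": 1.0 - err, "AB": 0.5, "BB": err}[state]
    n = a_reads + b_reads
    return math.comb(n, a_reads) * p_a ** a_reads * (1.0 - p_a) ** b_reads


def viterbi(read_counts, switch=0.01, err=0.01):
    """Most likely genotype path for one sample along one chromosome.

    read_counts is a list of (a_reads, b_reads) tuples in marker order;
    switch is an assumed per-interval probability of changing state.
    """
    log_stay = math.log(1.0 - switch)
    log_move = math.log(switch / 2.0)
    # Uniform prior over states at the first marker.
    score = {
        s: math.log(1.0 / 3.0) + math.log(emission(*read_counts[0], s, err))
        for s in STATES
    }
    back = []
    for a, b in read_counts[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this marker.
            prev = max(
                STATES,
                key=lambda p: score[p] + (log_stay if p == s else log_move),
            )
            trans = log_stay if prev == s else log_move
            new_score[s] = score[prev] + trans + math.log(emission(a, b, s, err))
            pointers[s] = prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

For example, `viterbi([(1, 0), (0, 1), (1, 0), (0, 1)])` treats four single-read markers that alternate between alleles as one heterozygous segment rather than a series of recombinations between homozygous states, and `viterbi([(3, 0), (0, 0), (2, 0)])` fills the uncovered middle marker from its flanking markers.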

Similar articles

A comprehensive evaluation of SNP genotype imputation.
Hum Genet. 2009 Mar;125(2):163-71. doi: 10.1007/s00439-008-0606-5. Epub 2008 Dec 17.
Molgenis-impute: imputation pipeline in a box.
BMC Res Notes. 2015 Aug 19;8:359. doi: 10.1186/s13104-015-1309-3.
