Efficient exact maximum a posteriori computation for Bayesian SNP genotyping in polyploids.

Affiliation

Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America.

Publication information

PLoS One. 2012;7(2):e30906. doi: 10.1371/journal.pone.0030906. Epub 2012 Feb 17.

Abstract

The problem of genotyping polyploids is extremely important for the creation of genetic maps and assembly of complex plant genomes. Despite its significance, polyploid genotyping remains largely unsolved and suffers from a lack of statistical formality. In this paper a graphical Bayesian model for SNP genotyping data is introduced. This model can infer genotypes even when the ploidy of the population is unknown. We also introduce an algorithm for finding the exact maximum a posteriori genotype configuration with this model. This algorithm is implemented in a freely available web-based software package, SuperMASSA. We demonstrate the utility, efficiency, and flexibility of the model and algorithm by applying them to two different platforms, each of which is applied to a polyploid data set: Illumina GoldenGate data from potato and Sequenom MassARRAY data from sugarcane. Our method achieves state-of-the-art performance on both data sets and can be trivially adapted to use models that utilize prior information about any platform or species.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/331c/3281906/67bf75716c7f/pone.0030906.g001.jpg
