Using object oriented Bayesian networks to model linkage, linkage disequilibrium and mutations between STR markers.

Affiliation

Department of Family Genetics, Norwegian Institute of Public Health, Oslo, Norway.

Publication information

PLoS One. 2012;7(9):e43873. doi: 10.1371/journal.pone.0043873. Epub 2012 Sep 11.

Abstract

In a number of applications there is a need to determine the most likely pedigree for a group of persons based on genetic markers. Adequate models are needed to reach this goal. The markers used to perform the statistical calculations can be linked and there may also be linkage disequilibrium (LD) in the population. The purpose of this paper is to present a graphical Bayesian Network framework to deal with such data. Potential LD is normally ignored and it is important to verify that the resulting calculations are not biased. Even if linkage does not influence results for regular paternity cases, it may have substantial impact on likelihood ratios involving other, more extended pedigrees. Models for LD influence likelihoods for all pedigrees to some degree and an initial estimate of the impact of ignoring LD and/or linkage is desirable, going beyond mere rules of thumb based on marker distance. Furthermore, we show how one can readily include a mutation model in the Bayesian Network; extending other programs or formulas to include such models may require considerable amounts of work and will in many cases not be practical. As an example, we consider the two STR markers vWa and D12S391. We estimate probabilities for population haplotypes to account for LD using a method based on data from trios, while an estimate for the degree of linkage is taken from the literature. The results show that accounting for haplotype frequencies is unnecessary in most cases for this specific pair of markers. When doing calculations on regular paternity cases, the markers can be considered statistically independent. In more complex cases of disputed relatedness, for instance cases involving siblings or so-called deficient cases, or when small differences in the likelihood ratio (LR) matter, independence should not be assumed. (The networks are freely available at http://arken.umb.no/~dakl/BayesianNetworks.)
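To make the distinction between linkage and LD concrete, the following is a minimal Python sketch, not the authors' Bayesian network. It assumes a standard two-locus model: a parent with known phase transmits a non-recombinant haplotype with probability (1 - theta)/2 each and a recombinant one with probability theta/2 each, and LD is measured by the usual coefficient D = P(AB) - P(A)P(B). The function names, allele labels, and all numeric values are hypothetical placeholders, not estimates from the paper.

```python
def gamete_probs(hap_a, hap_b, theta):
    """Two-locus gamete probabilities for a parent with phased haplotypes.

    hap_a, hap_b: (allele at marker 1, allele at marker 2) on each chromosome.
    theta: recombination fraction between the markers (0.5 means unlinked).
    """
    (a1, a2), (b1, b2) = hap_a, hap_b
    contributions = [
        ((a1, a2), (1 - theta) / 2),  # non-recombinant
        ((b1, b2), (1 - theta) / 2),  # non-recombinant
        ((a1, b2), theta / 2),        # recombinant
        ((b1, a2), theta / 2),        # recombinant
    ]
    probs = {}
    for gamete, p in contributions:
        # Identical gametes (e.g. homozygous parent) accumulate probability.
        probs[gamete] = probs.get(gamete, 0.0) + p
    return probs


def ld_coefficient(hap_freq, freq_allele_1, freq_allele_2):
    """LD coefficient D: haplotype frequency minus the product of allele frequencies."""
    return hap_freq - freq_allele_1 * freq_allele_2


if __name__ == "__main__":
    # Hypothetical phased parent; allele labels are illustrative only.
    parent = (("14", "18"), ("16", "20"))
    for theta in (0.5, 0.1):
        print(theta, gamete_probs(*parent, theta))
    # Hypothetical frequencies; D = 0 corresponds to no LD.
    print(ld_coefficient(0.06, 0.2, 0.25))
```

With theta = 0.5 all four gametes are equally likely, which is the independence assumption the abstract says is adequate for regular paternity cases. Smaller theta concentrates probability on the parental haplotypes, which is why linkage matters more in pedigrees where several meioses are informative, such as sibling cases.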
