Department of Animal Science, Iowa State University, Ames, IA, USA.
Genet Sel Evol. 2010 Jun 11;42(1):21. doi: 10.1186/1297-9686-42-21.
Information for mapping quantitative trait loci (QTL) comes from two sources: linkage disequilibrium (LD; non-random association of allele states) and cosegregation (non-random association of allele origin). Information from LD can be captured by modeling conditional means and variances at the QTL given marker information. Similarly, information from cosegregation can be captured by modeling conditional covariances. Here, we consider a Bayesian model based on gene frequency (BGF) in which both conditional means and variances are modeled as functions of the conditional gene frequencies at the QTL. The parameters of this model are these gene frequencies, the additive effect of the QTL, its location, and the residual variance. Bayesian methodology was used to estimate these parameters, with the following priors: logit-normal for the gene frequencies, normal for the additive effect, uniform for the location, and inverse chi-square for the residual variance. Computer simulation was used to compare the power to detect QTL and the accuracy of mapping them under this method with those of least squares analysis using a regression model (LSR).
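As a rough illustration of the prior structure described above, the following Python sketch draws one set of parameters from the four named prior families. It is not the authors' implementation. All hyperparameter values, the function names `sample_priors` and `conditional_moments`, and the use of a scaled inverse chi-square are assumptions made for this example. The second function further assumes a biallelic QTL in which each copy of the favourable allele adds `a` to the phenotype, so that the conditional mean and variance follow from two independent Bernoulli draws with conditional gene frequencies `p1` and `p2`.

```python
import math
import random


def sample_priors(rng, n_freqs=2, freq_mu=0.0, freq_sd=1.0,
                  effect_sd=1.0, segment_cm=1.0, nu=4.0, scale=1.0):
    """Draw one parameter set from the prior families named in the abstract.

    Every hyperparameter default here is an illustrative assumption.
    """
    # Logit-normal: a normal draw on the logit scale, mapped back to (0, 1).
    freqs = [1.0 / (1.0 + math.exp(-rng.gauss(freq_mu, freq_sd)))
             for _ in range(n_freqs)]
    # Normal prior for the additive QTL effect.
    effect = rng.gauss(0.0, effect_sd)
    # Uniform prior for the QTL location on the chromosomal segment (cM).
    location = rng.uniform(0.0, segment_cm)
    # Scaled inverse chi-square (assumed form): nu * scale / chi2(nu),
    # where chi2(nu) is Gamma(shape=nu/2, scale=2).
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    resid_var = nu * scale / chi2
    return {"freqs": freqs, "effect": effect,
            "location": location, "resid_var": resid_var}


def conditional_moments(p1, p2, a):
    """Mean and variance of the QTL genotypic value given marker information.

    Assumes the value is a * (number of favourable alleles) and that the two
    alleles are independent Bernoulli draws with frequencies p1 and p2.
    """
    mean = a * (p1 + p2)
    var = a * a * (p1 * (1.0 - p1) + p2 * (1.0 - p2))
    return mean, var


if __name__ == "__main__":
    rng = random.Random(1)
    draw = sample_priors(rng)
    print(draw)
    print(conditional_moments(draw["freqs"][0], draw["freqs"][1], draw["effect"]))
```

In a full analysis these draws would only initialise or regularise a sampler; the sketch makes no claim about how the posterior was actually explored.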
To simplify the analysis, data were simulated for unrelated individuals in a purebred population, where only LD information contributes to mapping the QTL. LD was simulated in a 1 cM chromosomal segment containing one QTL by random mating in a population of size 500 for 1000 generations and in a population of size 100 for 50 generations. The comparison covered a range of conditions: SNP density of 0.1, 0.05 or 0.02 cM, sample size of 500 or 1000, and phenotypic variance explained by the QTL of 2 or 5%. Both 1-SNP and 2-SNP models were considered. Power to detect the QTL with BGF ranged from 0.4 to 0.99 and was close or equal to the power of LSR. Precision in mapping QTL position, quantified by the mean absolute error, ranged from 0.11 to 0.21 cM for BGF and was better than that of LSR, which ranged from 0.12 to 0.25 cM.
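The least squares comparison and the mean absolute error criterion can be sketched in a few lines of Python. This is a minimal 1-SNP illustration under stated assumptions, not the procedure used in the study: genotypes are assumed to be coded as allele counts (0, 1, 2), the QTL position is assumed to be estimated as the position of the SNP with the largest regression F statistic, and the helper names `regress`, `map_qtl` and `mean_absolute_error` are hypothetical.

```python
def regress(x, y):
    """Simple least squares regression of phenotype y on SNP genotype x.

    Returns (slope, F statistic with 1 and n - 2 degrees of freedom).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0:
        # Monomorphic SNP: it carries no information about the phenotype.
        return 0.0, 0.0
    slope = sxy / sxx
    ss_reg = slope * sxy
    ss_res = syy - ss_reg
    f_stat = ss_reg / (ss_res / (n - 2)) if ss_res > 0 else float("inf")
    return slope, f_stat


def map_qtl(snps, phenotypes):
    """Return the position (cM) of the SNP with the largest F statistic.

    `snps` is a list of (position_cm, genotypes) pairs.
    """
    best_pos, _ = max(
        ((pos, regress(geno, phenotypes)[1]) for pos, geno in snps),
        key=lambda item: item[1],
    )
    return best_pos


def mean_absolute_error(estimates, true_positions):
    """Mean absolute distance (cM) between estimated and true QTL positions."""
    return sum(abs(e - t) for e, t in zip(estimates, true_positions)) / len(estimates)


if __name__ == "__main__":
    y = [1.0, 2.1, 2.9, 1.1, 1.9, 3.0]
    snps = [(0.1, [0, 1, 2, 0, 1, 2]), (0.2, [1, 1, 0, 2, 0, 2])]
    print(map_qtl(snps, y))
    print(mean_absolute_error([0.4, 0.6], [0.5, 0.5]))
```

Averaging the absolute error over simulation replicates gives a figure on the same scale as the values reported above, in cM.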
In conclusion, given a high SNP density, the gene frequency model can be used to map QTL with considerable accuracy, even within a 1 cM region.