Xu S, Vogl C
Department of Botany and Plant Sciences, University of California, Riverside 92521, USA.
Heredity (Edinb). 2000 May;84 (Pt 5):525-37. doi: 10.1046/j.1365-2540.2000.00653.x.
Selective genotyping is a cost-saving strategy in mapping quantitative trait loci (QTLs). When the proportion of individuals selected for genotyping is low, the majority of individuals are not genotyped, but their phenotypic values, if available, are still included in the data analysis to correct the bias in parameter estimation. These ungenotyped individuals contribute little information to linkage analysis, and including them can substantially increase the computational burden. For multiple trait analysis, ungenotyped individuals may not have a full array of phenotypic measurements. In this case, unbiased estimation of QTL effects using current methods seems to be impossible. In this study, we develop a maximum likelihood method of QTL mapping under selective genotyping that uses only the phenotypic values of genotyped individuals. Compared with the full data analysis (using all phenotypic values), the proposed method performs well. We derive an expectation-maximization (EM) algorithm that appears to be a simple modification of the existing EM algorithm for standard interval mapping. The new method can be readily incorporated into standard QTL mapping software, e.g. MAPMAKER. A general recommendation is that whenever full data analysis is possible, the full maximum likelihood analysis should be performed. If it is impossible to analyse the full data, e.g. because sample sizes are too large, phenotypic values of ungenotyped individuals are missing, or composite interval mapping is to be performed, the proposed method can be applied.
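The idea of analysing only the genotyped individuals can be illustrated with a conditional likelihood. The sketch below is not the authors' EM algorithm. It assumes a backcross with two QTL genotypes, a common residual variance, and selection of both phenotypic tails at known thresholds. Each genotyped individual contributes the usual interval-mapping mixture density divided by the probability of being selected. All function and parameter names (`selection_prob`, `selected_loglik`, `post`, `lower`, `upper`) are hypothetical.

```python
import math

def norm_pdf(y, mu, sigma):
    """Normal density at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def norm_cdf(y, mu, sigma):
    """Normal cumulative distribution function at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def selection_prob(mu, sigma, lower, upper, prior=(0.5, 0.5)):
    """Probability that a phenotype falls in a selected tail (y <= lower or
    y >= upper), averaged over QTL genotypes. The prior (0.5, 0.5) is the
    assumed backcross genotype frequency."""
    total = 0.0
    for p_g, mu_g in zip(prior, mu):
        in_tails = norm_cdf(lower, mu_g, sigma) + 1.0 - norm_cdf(upper, mu_g, sigma)
        total += p_g * in_tails
    return total

def selected_loglik(y, post, mu, sigma, lower, upper):
    """Log-likelihood of the genotyped (selected) individuals only.
    y[j] is a phenotype; post[j] holds the QTL genotype probabilities of
    individual j given its flanking markers (assumed precomputed)."""
    log_sel = math.log(selection_prob(mu, sigma, lower, upper))
    ll = 0.0
    for y_j, w_j in zip(y, post):
        mix = sum(w * norm_pdf(y_j, m, sigma) for w, m in zip(w_j, mu))
        ll += math.log(mix) - log_sel  # condition on having been selected
    return ll
```

When `lower` equals `upper`, every individual is selected, the selection probability is 1, and the expression reduces to the ordinary interval-mapping log-likelihood. Maximizing it over `mu` and `sigma` (by EM or a general-purpose optimizer) is left out of this sketch.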