Sugaya Yuki
School of Fundamental Science and Technology, Keio University, Yokohama, Japan.
BMC Res Notes. 2012 Aug 28;5:465. doi: 10.1186/1756-0500-5-465.
Linkage analysis is a useful tool for detecting genetic variants that regulate a trait of interest, especially genes associated with a given disease. Although penetrance parameters play an important role in determining gene location, they are usually either assigned arbitrary values based on the researcher's intuition or estimated by the maximum likelihood principle. Several methods exist for computing maximum likelihood estimates of penetrance, but not all of them are supported by software packages, and some are biased by marker genotype information even when disease development is due solely to the genotype at a single locus.
Programs for exploring the maximum likelihood estimates of penetrance parameters were developed in the R statistical programming language, supplemented by external C functions. The software returns a vector of coefficients of a polynomial in the penetrance parameters; this polynomial represents the likelihood of the pedigree data. From the likelihood polynomial supplied by the proposed method, the likelihood value and its gradient can be computed precisely. To reduce the influence of the supplied dataset on the likelihood function, feasible parameter constraints can be imposed on the maximum likelihood estimation, enabling flexible exploration of the penetrance estimates. An auxiliary program generates a perspective plot that allows visual validation of the model's convergence. The functions are collectively available as the MLEP R package.
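The idea of a likelihood polynomial can be illustrated with a toy sketch. It is written in Python rather than R, it is not MLEP's interface, and every function name in it is hypothetical. It assumes unrelated individuals with unobserved genotypes, Hardy-Weinberg proportions, and a disease allele frequency of 0.1, whereas real pedigree likelihoods sum over joint genotype configurations within families. The likelihood is stored as a mapping from exponent tuples to coefficients, its value and gradient are computed directly from those coefficients, and box constraints on the penetrances are enforced by a simple projected gradient ascent with step halving (a stand-in optimizer chosen for brevity).

```python
from collections import defaultdict


def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(float)
    for ea, ca in p.items():
        for eb, cb in q.items():
            out[tuple(a + b for a, b in zip(ea, eb))] += ca * cb
    return dict(out)


def person_poly(affected, weights):
    """Likelihood of one unrelated person whose genotype is unobserved.

    weights[g] is the prior probability of carrying g disease alleles, so an
    affected person contributes sum_g weights[g] * f[g] and an unaffected
    person contributes 1 minus that sum.
    """
    n = len(weights)
    sign = 1.0 if affected else -1.0
    poly = {} if affected else {(0,) * n: 1.0}
    for g, w in enumerate(weights):
        poly[tuple(int(i == g) for i in range(n))] = sign * w
    return poly


def evaluate(poly, f):
    """Value of the likelihood polynomial at penetrance vector f."""
    total = 0.0
    for exps, coef in poly.items():
        term = coef
        for x, e in zip(f, exps):
            term *= x ** e
        total += term
    return total


def gradient(poly, f):
    """Exact partial derivatives, differentiating each monomial term."""
    grad = [0.0] * len(f)
    for exps, coef in poly.items():
        for i, ei in enumerate(exps):
            if ei == 0:
                continue
            term = coef * ei
            for j, (x, e) in enumerate(zip(f, exps)):
                term *= x ** (e - 1 if j == i else e)
            grad[i] += term
    return grad


def constrained_ascent(poly, start, lower, upper, steps=200):
    """Projected gradient ascent on the log-likelihood inside a box.

    A step is accepted only if it increases the likelihood; otherwise the
    step size is halved. Assumes the likelihood is positive at `start`.
    """
    f, best, step = list(start), evaluate(poly, start), 1.0
    for _ in range(steps):
        g = gradient(poly, f)
        trial = [min(hi, max(lo, x + step * gi / best))
                 for x, gi, lo, hi in zip(f, g, lower, upper)]
        value = evaluate(poly, trial)
        if value > best:
            f, best = trial, value
        else:
            step *= 0.5
    return f, best


# Assumed toy data: 3 affected and 7 unaffected unrelated individuals.
q = 0.1
weights = ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)
poly = {(0, 0, 0): 1.0}
for affected in [True] * 3 + [False] * 7:
    poly = poly_mul(poly, person_poly(affected, weights))

start = [0.05, 0.5, 0.9]                          # f0, f1, f2
lower, upper = [0.0, 0.2, 0.5], [0.1, 1.0, 1.0]   # hypothetical constraints
estimate, value = constrained_ascent(poly, start, lower, upper)
print(estimate, value)
```

In this simplified setting the data determine only the weighted sum of the three penetrances, so the constraints rather than the data decide where the individual estimates settle. This is one way to see why constraining the parameter space matters when the dataset alone is not very informative.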
Linkage analysis using penetrance parameters estimated by the MLEP package enables feasible localization of a disease locus. This is shown through a simulation study and through a demonstration of how the package is used to explore maximum likelihood estimates. Although the input dataset tends to bias the likelihood estimates, for diseases with low allele frequencies the method yields accurate results that are superior to those of an analysis using intuitively chosen penetrance values. MLEP is distributed through the Comprehensive R Archive Network (CRAN) and is freely available at http://cran.r-project.org/web/packages/MLEP/index.html.