An EM algorithm for mapping binary disease loci: application to fibrosarcoma in a four-way cross mouse family.

Author information

Xu Shizhong, Yi Nengjun, Burke David, Galecki Andrzej, Miller Richard A

Affiliation

Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA.

Publication information

Genet Res. 2003 Oct;82(2):127-38. doi: 10.1017/s0016672303006414.

Abstract

Many diseases show dichotomous phenotypic variation but do not follow a simple Mendelian pattern of inheritance. Variation in these binary diseases is presumably controlled by multiple loci and environmental variants. A least-squares method has been developed for mapping such complex disease loci by treating the binary phenotypes (0 and 1) as if they were continuous. However, the least-squares method is not recommended because of its ad hoc nature. Maximum likelihood (ML) and Bayesian methods, which incorporate the discrete nature of the phenotypic distribution, have also been developed for binary disease mapping. In the ML analysis, the likelihood function is usually maximized using a complicated numerical algorithm (e.g. the Newton-Raphson or the simplex algorithm). Under the threshold model of binary disease, we develop an expectation-maximization (EM) algorithm to solve for the maximum likelihood estimates (MLEs). The new EM algorithm is developed by treating both the unobserved genotype and the disease liability as missing values. As a result, the EM iteration equations have the same form as the normal equation system in linear regression. The EM algorithm is further modified to take into account sexual dimorphism in the linkage maps. Applying the EM-implemented ML method to a four-way-cross mouse family, we detected two regions on chromosome 4 that show evidence of QTLs controlling the segregation of fibrosarcoma, a form of connective tissue cancer. The two QTLs explain 50-60% of the variance in the disease liability. We also applied a previously developed Bayesian method (modified to take into account sex-specific maps) to this data set and detected one additional QTL on chromosome 13 that explains another 26% of the variance in the disease liability. All the QTLs detected primarily show dominance effects.
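
To illustrate why treating the liability as missing data leads to iteration equations of the normal-equation form, here is a minimal sketch of EM for a threshold (probit) model. It is not the authors' algorithm: it assumes the genotype design matrix is fully observed, fixes the residual variance of the liability at 1 and the threshold at 0, and uses hypothetical names (`em_threshold`, `X`, `w`) chosen only for this example. The E-step replaces each unobserved liability by its truncated-normal expectation, and the M-step solves an ordinary least-squares system.

```python
import math
import numpy as np


def norm_pdf(z):
    """Standard normal density, elementwise."""
    return np.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function, elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))


def em_threshold(X, w, n_iter=500, tol=1e-8):
    """EM for a threshold model: liability y = X b + e, e ~ N(0, 1),
    with observed phenotype w = 1 if y > 0 and w = 0 otherwise.

    Assumption for this sketch: X (e.g. genotype codes) is fully observed.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    b = np.zeros(X.shape[1])
    XtX = X.T @ X  # left-hand side of the normal equations, fixed across iterations
    for _ in range(n_iter):
        mu = X @ b
        p = np.clip(norm_cdf(mu), 1e-12, 1.0 - 1e-12)  # guard against division by zero
        d = norm_pdf(mu)
        # E-step: expected liability given the observed binary phenotype
        # (mean of a normal truncated above 0 for w = 1, below 0 for w = 0).
        y_hat = np.where(w == 1, mu + d / p, mu - d / (1.0 - p))
        # M-step: normal equations X'X b = X'y_hat, as in linear regression.
        b_new = np.linalg.solve(XtX, X.T @ y_hat)
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    return b


if __name__ == "__main__":
    # Intercept-only check: with 3 of 4 individuals affected, the MLE of the
    # intercept is the standard normal quantile at 0.75.
    print(em_threshold(np.ones((4, 1)), [1, 1, 1, 0]))
```

In an interval-mapping setting the genotype columns of `X` would themselves be unobserved and would have to be handled in the E-step using marker information; that part is omitted here.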

