Brugger Markus, Strauch Konstantin
Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich, and Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany.
Hum Hered. 2014;78(3-4):179-94. doi: 10.1159/000369065. Epub 2015 Jan 29.
As the mode of inheritance is often unknown for complex diseases, a MOD-score analysis, in which the parametric LOD score is maximized with respect to the trait-model parameters, can be a powerful approach in genetic linkage analysis. Because the calculation of the disease-locus likelihood is the most time-consuming step in a MOD-score analysis, we aimed to optimize this part of the calculation to speed up linkage analysis using the GENEHUNTER-MODSCORE software package.
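As a reading aid, the maximization described above can be written out as follows. The notation is an assumption made for illustration and is not taken from the abstract: $x$ is a genetic position, $\theta$ collects the trait-model parameters (penetrances and disease-allele frequency), and $L$ is the pedigree likelihood.

```latex
\mathrm{LOD}(x,\theta) = \log_{10}\frac{L(x,\theta)}{L(\text{no linkage},\theta)},
\qquad
\mathrm{MOD}(x) = \max_{\theta}\,\mathrm{LOD}(x,\theta)
```

Because the maximum is taken over many candidate values of $\theta$, the disease-locus likelihood has to be re-evaluated for every tested parameter set, which is why that step dominates the run time.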
Our new algorithm is based on minimizing the effective number of inheritance vectors by collapsing them into classes. To this end, the disease-locus-likelihood contribution of each inheritance vector is represented and stored in its algebraic form as a symbolic sum of products of penetrances and disease-allele frequencies. Simulations were used to assess the speedup of our new algorithm.
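The collapsing idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the GENEHUNTER-MODSCORE implementation: the variable order `VARS`, the helper names, and the four hand-written contributions are all hypothetical, and no real pedigree likelihood is derived here. Each contribution is stored as a list of `(exponents, coefficient)` terms, vectors with the same symbolic sum fall into one class, and each class is evaluated once per parameter set.

```python
from collections import defaultdict

# Assumed variable order for the exponent tuples (hypothetical):
# allele frequencies p, q and penetrances f0, f1, f2.
VARS = ("p", "q", "f0", "f1", "f2")

def canonical(poly):
    """Merge equal monomials and return a hashable, order-independent key."""
    merged = defaultdict(int)
    for exponents, coeff in poly:
        merged[tuple(exponents)] += coeff
    return frozenset((m, c) for m, c in merged.items() if c != 0)

def collapse(contributions):
    """Group inheritance-vector ids by identical symbolic contribution."""
    classes = defaultdict(list)
    for vec, poly in contributions.items():
        classes[canonical(poly)].append(vec)
    return classes

def evaluate(key, params):
    """Numerically evaluate one symbolic sum of products."""
    values = [params[name] for name in VARS]
    total = 0.0
    for exponents, coeff in key:
        term = coeff
        for value, exponent in zip(values, exponents):
            term *= value ** exponent
        total += term
    return total

def likelihoods(classes, params):
    """Evaluate each class once and share the value among its vectors."""
    out = {}
    for key, vectors in classes.items():
        value = evaluate(key, params)
        for vec in vectors:
            out[vec] = value
    return out

# Made-up contributions for four inheritance vectors (illustration only).
contributions = {
    0: [((2, 0, 0, 0, 1), 1), ((1, 1, 0, 1, 0), 2)],  # p^2*f2 + 2*p*q*f1
    1: [((1, 1, 0, 1, 0), 1), ((2, 0, 0, 0, 1), 1),
        ((1, 1, 0, 1, 0), 1)],                         # same sum, other order
    2: [((0, 2, 1, 0, 0), 1)],                         # q^2*f0
    3: [((0, 2, 1, 0, 0), 1)],                         # q^2*f0
}

classes = collapse(contributions)  # built once, reused for every parameter set
params = {"p": 0.1, "q": 0.9, "f0": 0.01, "f1": 0.5, "f2": 0.9}
out = likelihoods(classes, params)
```

In this toy case four inheritance vectors collapse into two classes, so each new parameter set costs two polynomial evaluations instead of four. The classes depend only on the symbolic form, so they can be reused unchanged across all tested trait models.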
We achieved speedup factors ranging from 1.94 to 11.52 relative to the original GENEHUNTER-MODSCORE version, with higher speedups for larger pedigrees. When p values were also calculated, the speedup factor ranged from 1.69 to 10.36.
Computation times for MOD-score analysis, involving the evaluation of many tested sets of trait-model parameters and p value calculation, have been prohibitively high so far. With our new algebraic algorithm, such an analysis is now feasible within a reasonable amount of time.