Li Meng, Liu Xiaolei, Bradbury Peter, Yu Jianming, Zhang Yuan-Ming, Todhunter Rory J, Buckler Edward S, Zhang Zhiwu
Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA.
BMC Biol. 2014 Oct 17;12:73. doi: 10.1186/s12915-014-0073-5.
The inheritance of most human diseases and agriculturally important traits is controlled by many genes with small effects. Identifying these genes while simultaneously controlling false positives is challenging. Among available statistical methods, the mixed linear model (MLM) has been the most flexible and powerful for controlling population structure and unequal relatedness among individuals (kinship), the two common causes of spurious associations. The compressed MLM (CMLM) method provided additional opportunities for optimization by adding two new model parameters: the grouping algorithm and the number of groups.
This study introduces another model parameter to develop an enriched CMLM (ECMLM). The parameter is the algorithm used to define kinship between groups (that is, the kinship algorithm). The ECMLM calculates kinship using several different algorithms and then chooses the best combination of kinship algorithm and grouping algorithm.
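The idea of deriving group-level kinship with several algorithms and then picking the best combination can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary statistics (mean, median, max, min), the function names `group_kinship` and `choose_best`, and the user-supplied `score` callable (lower is better, standing in for whatever model-fit criterion is used) are all assumptions introduced here. As a further simplification, within-group blocks include the diagonal of the individual-level matrix.

```python
import numpy as np

# Candidate "kinship algorithms": ways to summarize the individual-level
# kinship values between the members of two groups (assumed set).
SUMMARIES = {"mean": np.mean, "median": np.median, "max": np.max, "min": np.min}


def group_kinship(K, labels, summary="mean"):
    """Compress an n x n individual kinship matrix K to a group-level matrix.

    labels[i] is the group of individual i; entry (a, b) of the result
    summarizes all K[i, j] with i in group a and j in group b.
    """
    K = np.asarray(K, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    reduce = SUMMARIES[summary]
    G = np.empty((len(groups), len(groups)))
    for a, ga in enumerate(groups):
        rows = np.flatnonzero(labels == ga)
        for b, gb in enumerate(groups):
            cols = np.flatnonzero(labels == gb)
            G[a, b] = reduce(K[np.ix_(rows, cols)])
    return G


def choose_best(K, groupings, score):
    """Try every (grouping, summary) pair and keep the lowest score.

    groupings maps a grouping name to a label vector; score is a
    hypothetical callable that takes a group kinship matrix and returns
    a number, where lower means a better fit.
    """
    best = None
    for gname, labels in groupings.items():
        for sname in SUMMARIES:
            s = score(group_kinship(K, labels, sname))
            if best is None or s < best[0]:
                best = (s, gname, sname)
    return best
```

In practice the label vectors in `groupings` would come from different clustering methods applied to the kinship matrix, and `score` would fit the mixed model with the candidate group kinship and return a fit criterion; both are left to the caller in this sketch.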
Simulations show that the ECMLM increases statistical power. In some cases, the magnitude of power gained by using ECMLM instead of CMLM is larger than the improvement found by using CMLM instead of MLM.