Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America.
Unitat de Genòmica de Malalties Complexes, Institut d'Investigació Biomèdica Sant Pau (IIB-Sant Pau), Barcelona, Spain.
BMC Bioinformatics. 2018 Feb 27;19(1):68. doi: 10.1186/s12859-018-2057-x.
Quantitative trait locus (QTL) mapping in genetic data often involves the analysis of correlated observations, which must be accounted for to avoid false association signals. This is commonly done by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements the major LMM features using sparse matrix methods; however, it is not fully adapted to QTL mapping in association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices, and parameter constraints, which are essential in advanced QTL models. Beyond linkage studies of related individuals, these functionalities are of high interest for association studies in which multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software packages.
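To make the idea of a random effect defined by a custom covariance matrix concrete, the following is a minimal NumPy sketch, not the lme4 or lme4qtl implementation. It assumes a single genetic random effect whose covariance is proportional to a relatedness matrix K (twice the kinship coefficients), so that the marginal covariance of the trait is V = sigma_g2 * K + sigma_e2 * I. The function names, the nuclear-family pedigree, and the variance values are all hypothetical and chosen only for illustration; the variance components are treated as known here, whereas an LMM fit would estimate them.

```python
import numpy as np

def nuclear_family_relatedness(n_children):
    """Expected relatedness (2 * kinship) for two unrelated parents and their children."""
    n = 2 + n_children
    K = np.full((n, n), 0.5)      # parent-child and full-sib pairs share 0.5
    K[0, 1] = K[1, 0] = 0.0       # the two parents are assumed unrelated
    np.fill_diagonal(K, 1.0)      # each individual with itself
    return K

def marginal_covariance(K, sigma_g2, sigma_e2):
    """Covariance of the trait implied by one genetic random effect plus residual noise."""
    return sigma_g2 * K + sigma_e2 * np.eye(K.shape[0])

def gls_fixed_effects(X, y, V):
    """Generalized least squares estimate of fixed effects for a known covariance V."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)

# Three independent families of four give a block-diagonal relatedness matrix.
K_family = nuclear_family_relatedness(n_children=2)
K = np.kron(np.eye(3), K_family)

# Illustrative variance components (assumed values, not estimates from any data set).
V = marginal_covariance(K, sigma_g2=0.6, sigma_e2=0.4)

# Simulate a trait with an intercept and one covariate, then recover the fixed effects.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(K.shape[0]), rng.normal(size=K.shape[0])])
y = rng.multivariate_normal(X @ np.array([1.0, 0.5]), V)
beta_hat = gls_fixed_effects(X, y, V)

# Most entries of K are zero because individuals in different families are unrelated.
sparsity = np.mean(K == 0)
```

Ignoring K and fitting ordinary least squares would treat the twelve observations as independent, which is the source of the false association signals described above.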
To address these limitations, we developed a new R package, lme4qtl, as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and is computationally efficient when covariance matrices are sparse. We showed the value of our package using real family-based data from the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project.
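The efficiency gain from sparse covariance matrices can be illustrated with a small NumPy sketch. It assumes the covariance is block diagonal, as happens when families are unrelated to each other, and compares solving a linear system block by block with solving it on the full dense matrix. This is only a toy demonstration of why sparsity helps; lme4 relies on sparse matrix methods rather than this explicit loop, and the block sizes and the helper name `solve_blockwise` are hypothetical.

```python
import numpy as np

def solve_blockwise(blocks, y):
    """Solve V x = y where V is block diagonal with the given dense blocks."""
    out = np.empty_like(y, dtype=float)
    start = 0
    for B in blocks:
        stop = start + B.shape[0]
        # Each family-sized block is solved independently of the others.
        out[start:stop] = np.linalg.solve(B, y[start:stop])
        start = stop
    return out

rng = np.random.default_rng(1)

# Build three symmetric positive definite blocks of different sizes.
blocks = []
for size in (3, 4, 5):
    A = rng.normal(size=(size, size))
    blocks.append(A @ A.T + size * np.eye(size))

# Assemble the equivalent dense matrix for comparison.
n = sum(B.shape[0] for B in blocks)
V = np.zeros((n, n))
start = 0
for B in blocks:
    stop = start + B.shape[0]
    V[start:stop, start:stop] = B
    start = stop

y = rng.normal(size=n)
x_block = solve_blockwise(blocks, y)   # work scales with the block sizes
x_dense = np.linalg.solve(V, y)        # work scales with the full dimension
```

Both routes give the same solution, but the blockwise cost grows with the sum of the cubed block sizes rather than with the cube of the total sample size, which is the practical benefit when many small families are analyzed together.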
Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl.