Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
F1000Res. 2020 Jun 4;9:512. doi: 10.12688/f1000research.24187.2. eCollection 2020.
Linear and generalized linear models are used extensively in many scientific fields, both to model observed data and as the basis for hypothesis tests. Using such models requires specifying a design matrix and then formulating contrasts that represent the scientific hypotheses of interest. Executing these steps properly requires a thorough understanding of what the individual coefficients mean, and this is a frequent source of uncertainty for end users. Here, we present an R/Bioconductor package that enables interactive exploration of design matrices and linear model diagnostics. Given a sample data table and a desired design formula, the package displays how the model coefficients are combined to give the fitted values for each combination of predictor variables, which allows users both to extract the interpretation of each individual coefficient and to formulate desired linear contrasts. In addition, the interactive interface displays informative characteristics of the regular linear model corresponding to the provided design, such as variance inflation factors and the pseudoinverse of the design matrix. We envision the package and its built-in collection of common types of linear model designs to be useful for teaching and self-learning purposes, as well as for assisting more experienced users in the interpretation of complex model designs.
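The idea that each fitted value is a sum of model coefficients, and that a contrast is a difference between two such sums, can be illustrated with a minimal sketch. The example below is plain Python and is not the package's R interface. The factors `genotype` and `treatment`, their levels, and all function names are hypothetical and chosen only for illustration. It assumes treatment (dummy) coding with the first listed level of each factor as the reference, which mirrors the default coding of R's `model.matrix` for factors with that level order.

```python
from itertools import product

# Hypothetical two-factor experiment; names and levels are illustrative only.
# The first level of each factor is treated as the reference level.
factors = {
    "genotype": ["WT", "KO"],
    "treatment": ["ctrl", "drug"],
}


def design_columns(factors, interaction=False):
    """Column names of the design matrix under treatment (dummy) coding."""
    cols = ["(Intercept)"]
    for name, levels in factors.items():
        cols += [f"{name}{lvl}" for lvl in levels[1:]]
    if interaction:
        # This sketch supports an interaction between exactly two factors.
        (n1, l1), (n2, l2) = factors.items()
        cols += [f"{n1}{a}:{n2}{b}" for a in l1[1:] for b in l2[1:]]
    return cols


def design_row(sample, factors, interaction=False):
    """Design matrix row (0/1 indicators) for one combination of factor levels."""
    row = [1]  # intercept
    for name, levels in factors.items():
        row += [int(sample[name] == lvl) for lvl in levels[1:]]
    if interaction:
        (n1, l1), (n2, l2) = factors.items()
        row += [int(sample[n1] == a and sample[n2] == b)
                for a in l1[1:] for b in l2[1:]]
    return row


def as_expression(cols, weights):
    """Render a weight vector as a signed sum of coefficient names."""
    parts = []
    for col, w in zip(cols, weights):
        if w == 0:
            continue
        sign = "-" if w < 0 else "+"
        coef = col if abs(w) == 1 else f"{abs(w)}*{col}"
        parts.append((sign, coef))
    if not parts:
        return "0"
    first_sign, first = parts[0]
    text = ("-" if first_sign == "-" else "") + first
    for sign, coef in parts[1:]:
        text += f" {sign} {coef}"
    return text


# Design corresponding to a formula of the form ~ genotype * treatment.
cols = design_columns(factors, interaction=True)
table = {}
for combo in product(*factors.values()):
    sample = dict(zip(factors, combo))
    table[combo] = design_row(sample, factors, interaction=True)

# Fitted value of each predictor combination as a sum of coefficients.
for combo, row in table.items():
    print(combo, "->", as_expression(cols, row))

# Contrast for the drug effect within KO samples:
# fitted(KO, drug) - fitted(KO, ctrl), taken element-wise on the design rows.
contrast = [a - b for a, b in zip(table[("KO", "drug")], table[("KO", "ctrl")])]
print("drug effect in KO:", as_expression(cols, contrast))
```

Subtracting two design rows yields the contrast vector directly, which is why a per-combination view of the coefficients makes contrasts easier to formulate. Diagnostics such as variance inflation factors and the pseudoinverse need numerical linear algebra and are left out of this sketch.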