Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, USA.
Biometrics. 2022 Sep;78(3):974-987. doi: 10.1111/biom.13465. Epub 2021 May 4.
Compositional data are common in many fields, both as outcomes and as predictor variables. The inventory of models for the case when both the outcome and the predictor variables are compositional is limited, and the existing models are often difficult to interpret in the compositional space because they rely on complex log-ratio transformations. We develop a transformation-free linear regression model in which the expected value of the compositional outcome is expressed as a single Markov transition from the compositional predictor. Our approach is based on estimating equations, so it does not require complete specification of the data likelihood, and it is robust to different data-generating mechanisms. Our model is simple to interpret, allows for 0s and 1s in both the compositional outcome and covariates, and subsumes several interesting subcases. We also develop permutation tests for linear independence and for equality of the effect sizes of two components of the predictor. Finally, using two datasets from education and medical research, we show that despite its simplicity, our model accurately captures the relationship between compositional data.
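The mean structure described above, in which the expected outcome is a single Markov transition from the predictor, can be sketched in a few lines. This is only an illustration of that forward model, assuming the transition is represented by a row-stochastic matrix; the name `expected_composition` and the numbers in `B` and `x` are hypothetical and are not taken from the paper, and the estimating-equation fitting procedure is not shown.

```python
import numpy as np

def expected_composition(x, B):
    """Expected compositional outcome as one Markov transition from x.

    x : composition of length p (non-negative, sums to 1)
    B : p-by-q transition matrix (non-negative, each row sums to 1)
    Returns the length-q vector with entries sum_i x[i] * B[i, j].
    """
    x = np.asarray(x, dtype=float)
    B = np.asarray(B, dtype=float)
    if (x < 0).any() or not np.isclose(x.sum(), 1.0):
        raise ValueError("x must lie on the simplex")
    if (B < 0).any() or not np.allclose(B.sum(axis=1), 1.0):
        raise ValueError("B must be row-stochastic")
    return x @ B

# Hypothetical transition matrix: 2 predictor parts, 3 outcome parts.
B = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])

x = np.array([0.25, 0.75])            # an interior composition
mu = expected_composition(x, B)       # a valid composition (sums to 1)

x_vertex = np.array([1.0, 0.0])       # 0s and 1s need no special handling
mu_vertex = expected_composition(x_vertex, B)  # equals the first row of B
```

Because every row of `B` lies on the simplex, the output is a convex combination of those rows and therefore stays on the simplex without any log-ratio transformation. Each entry `B[i, j]` can be read as the share of predictor part `i` that is carried into outcome part `j`.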