Department of Biostatistics, University of Michigan, 1415 Washington Heights, 48109 Ann Arbor, MI, USA.
Int J Biostat. 2021 Apr 6;18(1):57-72. doi: 10.1515/ijb-2020-0151.
In regression models, predictor variables with inherent ordering, such as ECOG performance status or novel biomarker expression levels, are commonly seen in medical settings. Statistically, it may be difficult to determine the functional form of an ordinal predictor variable. Often, such a variable is dichotomized based on whether it is above or below a certain cutoff. Other methods conveniently treat the ordinal predictor as a continuous variable and assume a linear relationship with the outcome. However, arbitrarily choosing a method may lead to inaccurate inference and treatment. In this paper, we propose a Bayesian mixture model that considers both dichotomous and linear forms for the variable. By framing the presence of a changepoint as a threshold detection problem, the model allows simultaneous assessment of the appropriate form of the predictor in regression models. This method is applicable to continuous, binary, and survival outcomes, and it is easily amenable to penalized regression. We evaluate the proposed method using simulation studies and apply it to two real datasets. We provide JAGS code for easy implementation.
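To make the two candidate functional forms concrete, the following is a minimal Python sketch that fits a continuous outcome on an ordinal predictor twice: once with the predictor entered linearly, and once for each possible dichotomization at a cutoff. It is not the Bayesian mixture model described above and not the authors' JAGS code; it is an ordinary least-squares comparison by residual sum of squares, and the function names and toy data are hypothetical, chosen only for illustration.

```python
from statistics import mean


def rss_simple_regression(z, y):
    """Residual sum of squares from a least-squares fit of y on an intercept and z."""
    z_bar, y_bar = mean(z), mean(y)
    szz = sum((zi - z_bar) ** 2 for zi in z)
    if szz == 0:
        # z is constant, so only the intercept can be estimated.
        return sum((yi - y_bar) ** 2 for yi in y)
    slope = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / szz
    intercept = y_bar - slope * z_bar
    return sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))


def compare_forms(x, y):
    """Return {form name: RSS} for the linear form and every dichotomized form."""
    levels = sorted(set(x))
    results = {"linear": rss_simple_regression(x, y)}
    # Each level except the largest is a candidate cutoff c for the indicator I(x > c).
    for c in levels[:-1]:
        z = [1.0 if xi > c else 0.0 for xi in x]
        results[f"dichotomous(x > {c})"] = rss_simple_regression(z, y)
    return results


# Toy data (made up): the outcome jumps between levels 1 and 2 of the predictor.
x = [0, 0, 1, 1, 2, 2, 3, 3]
y = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

fits = compare_forms(x, y)
best = min(fits, key=fits.get)
for name, rss in fits.items():
    print(f"{name}: RSS = {rss:.3f}")
print("smallest RSS:", best)
```

On this toy data the dichotomized form with cutoff 1 fits best, because the outcome is flat on either side of a single jump. A comparison like this commits to one form after the fact; the mixture model in the abstract instead weighs the dichotomous and linear forms within a single model.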