Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, United Kingdom.
J Clin Epidemiol. 2023 Sep;161:140-151. doi: 10.1016/j.jclinepi.2023.07.017. Epub 2023 Aug 2.
When developing a clinical prediction model, assuming a linear relationship between the continuous predictors and outcome is not recommended. Incorrect specification of the functional form of continuous predictors could reduce predictive accuracy. We examine how continuous predictors are handled in studies developing a clinical prediction model.
We searched PubMed for clinical prediction model studies developing a logistic regression model for a binary outcome, published between July 01, 2020, and July 30, 2020.
In total, 118 studies were included in the review: 18 studies (15%) assessed the linearity assumption or used methods to handle nonlinearity, and 100 studies (85%) did not. Transformations and splines were the most common methods for handling nonlinearity, used in 7 (n = 7/18, 39%) and 6 (n = 6/18, 33%) studies, respectively. Categorization was the most frequently used method to handle continuous predictors (n = 67/118, 56.8%), and most of these studies used dichotomization (n = 40/67, 60%). Only 10 models included nonlinear terms in the final model (n = 10/18, 56%).
Although it is widely recommended not to categorize continuous predictors or assume a linear relationship between the outcome and continuous predictors, most studies categorize continuous predictors, few assess the linearity assumption, and even fewer use methods that account for nonlinearity. We provide methodological guidance on how to handle continuous predictors when developing a clinical prediction model.
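To make the contrast between dichotomization and splines concrete, the following is a minimal sketch, not the method of any study in the review. It encodes a single continuous predictor either as a 0/1 indicator or as a restricted cubic spline basis in a common truncated-power form. The function names, the cutpoint of 50, and the knots at 30, 50, and 70 are hypothetical choices made for illustration; statistical software often rescales the spline terms, which this sketch does not do.

```python
def dichotomize(x, cutpoint):
    """Encode a continuous value as a 0/1 indicator (at or above the cutpoint)."""
    return 1 if x >= cutpoint else 0


def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x (unscaled truncated-power form).

    Returns [x, s_1, ..., s_{k-2}] for k knots. The nonlinear terms are zero
    below the first knot and linear in x above the last knot.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def pos_cube(u):
        # (u)_+^3: cube of the positive part of u
        return max(u, 0.0) ** 3

    denom = t[-1] - t[-2]
    terms = [float(x)]
    for j in range(k - 2):
        terms.append(
            pos_cube(x - t[j])
            - pos_cube(x - t[-2]) * (t[-1] - t[j]) / denom
            + pos_cube(x - t[-1]) * (t[-2] - t[j]) / denom
        )
    return terms


if __name__ == "__main__":
    cutpoint = 50        # hypothetical cutpoint
    knots = [30, 50, 70]  # hypothetical knot locations
    for value in (49, 51, 90):
        print(value, dichotomize(value, cutpoint), rcs_basis(value, knots))
```

With these assumed settings, values of 51 and 90 receive the same dichotomized code even though they are far apart, whereas the spline columns differ, so a logistic regression fitted on the spline columns can still distinguish them.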