Centre for Statistics in Medicine, Wolfson College Annexe, University of Oxford, Linton Road, Oxford OX2 6UD, UK.
BMC Med. 2010 Mar 30;8:21. doi: 10.1186/1741-7015-8-21.
Appropriate choice and use of prognostic models in clinical practice require good methods both for developing the models and for deriving prognostic indices and risk groups from them. To allow their reliability and generalizability to be assessed, models need to have been validated and measures of model performance reported. We reviewed published articles to assess the methods and reporting used to develop and evaluate the performance of prognostic indices and risk groups derived from prognostic models.
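As an illustration of what a reported measure of model performance can look like, the sketch below computes Harrell's concordance index, a widely used measure of discrimination for time-to-event outcomes. It is a simplified version that ignores pairs with tied event times, and the function name `concordance_index` and the four-patient data set are hypothetical, chosen only for this example; nothing here is taken from the reviewed articles.

```python
def concordance_index(times, events, scores):
    """Simplified Harrell's C for right-censored data.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed, 0 if censored
    scores -- predicted risk (a higher score should mean a shorter time)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs: cannot compute concordance")
    return concordant / usable


# Hypothetical data: four patients, the third one censored at 12 months.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
scores = [2.5, 1.8, 1.9, 0.4]

c_index = concordance_index(times, events, scores)
print(f"C-index: {c_index:.2f}")
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking. Discrimination is only one aspect of performance; calibration needs to be assessed separately.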
We developed a systematic search string and identified articles from PubMed. Forty-seven articles satisfied the following inclusion criteria: published in 2005; aiming to predict patient outcome; presenting a new prognostic model in cancer with a time-to-event outcome and combining at least two separate variables; and analysing the data using multivariable analysis suitable for time-to-event data.
Of the 47 studies, 94% (44) used Cox models, but only 72% (34) reported the coefficients or hazard ratios for the variables in the final model. The reproducibility of the derived model was assessed in only 11% (5) of the articles. A prognostic index was developed from the model in 81% (38) of the articles, but researchers derived the prognostic index from the final prognostic model in only 34% (13 of 38) of these; coefficients or variables different from those in the final model were used in 50% (19 of 38), and the methods used were unclear in 16% (6 of 38). Methods used to derive prognostic groups were also poor: researchers did not report the methods used in 39% (14 of 36) of the studies, and data-derived methods likely to bias estimates of differences between risk groups were used in 28% (10 of 36). Validation of the model was reported in only 34% (16) of the studies. In 15 studies validation used data from the same population and in five studies data from a different population. Including reports of validation with external data from publications up to four years after model development, external validation was attempted for only 21% (10) of the models. Insufficient information was provided on the performance of the models in terms of discrimination and calibration.
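To make concrete what deriving a prognostic index from the final model means, the following minimal sketch computes the index as the linear predictor of a Cox model (the sum of each coefficient multiplied by the patient's covariate value) and assigns risk groups using cut points fixed in advance rather than chosen from the data. The variable names, coefficient values and cut points are all hypothetical and serve only as an example; they do not come from any of the reviewed models.

```python
# Hypothetical log hazard ratios from a final Cox model (illustrative values only).
COEFFICIENTS = {
    "age_decades_over_50": 0.30,
    "stage_iii_or_iv": 0.85,
    "elevated_ldh": 0.55,
}

# Hypothetical cut points, assumed to be specified before looking at outcomes
# rather than searched for to maximise the separation between groups.
CUT_POINTS = (1.0, 2.0)


def prognostic_index(patient, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * covariate value over all model variables."""
    return sum(beta * patient[name] for name, beta in coefficients.items())


def risk_group(index_value, cut_points=CUT_POINTS):
    """Map a prognostic index value to one of three risk groups."""
    low_cut, high_cut = cut_points
    if index_value < low_cut:
        return "low"
    if index_value < high_cut:
        return "intermediate"
    return "high"


# Hypothetical patient: 62 years old, advanced stage, normal LDH.
patient = {"age_decades_over_50": 1.2, "stage_iii_or_iv": 1, "elevated_ldh": 0}

pi_value = prognostic_index(patient)
print(f"Prognostic index: {pi_value:.2f}, risk group: {risk_group(pi_value)}")
```

Using exactly the variables and coefficients of the final model keeps the index consistent with the reported model, which is only possible for readers when those coefficients are published.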
Many published prognostic models have been developed using poor methods, and many are poorly reported. Both shortcomings compromise the reliability and clinical relevance of the models and of the prognostic indices and risk groups derived from them.