Colunga-Lozano Luis Enrique, Foroutan Farid, Rayner Daniel, De Luca Christopher, Hernández-Wolters Benjamin, Couban Rachel, Ibrahim Quazi, Guyatt Gordon
Department of Clinical Medicine, Health Science Center, Universidad de Guadalajara, Guadalajara, Jalisco, México; Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada.
Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada.
J Clin Epidemiol. 2023 Oct 28. doi: 10.1016/j.jclinepi.2023.10.016.
To systematically review the comparative statistical performance (discrimination and/or calibration) of prognostic clinical prediction models (CPMs) and clinician judgment (CJ).
We conducted a systematic review of observational studies indexed in PubMed, Medline, Embase, and CINAHL. Eligible studies reported a direct statistical comparison between prognostic CPMs and CJ. Risk of bias was assessed using the PROBAST tool.
We identified 41 studies, of which 39 were at high risk of bias. Of these 41 studies, 39 examined discrimination and 12 assessed calibration. Prognostic CPMs had a median AUC of 0.73 (IQR, 0.62 - 0.81), while CJ had a median AUC of 0.71 (IQR, 0.62 - 0.81). Twenty-nine studies provided 124 discrimination metrics suitable for comparative analysis. Among these, 58 (46.7%) showed no significant difference between prognostic CPMs and CJ (p > 0.05), 31 (25%) favored prognostic CPMs, and 35 (28.2%) favored CJ. Four studies compared calibration, showing better performance for prognostic CPMs.
CJ frequently demonstrates discrimination comparable or superior to that of prognostic CPMs, although models outperform CJ on calibration. Studies comparing the performance of prognostic CPMs and CJ require large improvements in reporting.
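For readers unfamiliar with the two performance measures compared above, the following is a minimal Python sketch, not the review's own method (the review pooled metrics as reported by the included studies). It computes discrimination as the AUC, using the pairwise-comparison definition, and a simple calibration summary (mean predicted risk minus observed event rate, often called calibration-in-the-large). The function names `auc` and `calibration_in_the_large`, and all of the data, are hypothetical and chosen only for illustration.

```python
def auc(scores, outcomes):
    """Discrimination: probability that a randomly chosen patient with the
    event received a higher predicted risk than a randomly chosen patient
    without it. Ties count as half a win."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(probs, outcomes):
    """Calibration summary: mean predicted risk minus observed event rate.
    Zero means predictions match the overall event rate; a positive value
    means risk is overestimated on average."""
    return sum(probs) / len(probs) - sum(outcomes) / len(outcomes)


# Hypothetical data: 1 = event occurred, 0 = no event.
outcomes = [1, 0, 1, 0, 0, 1, 0, 1]
# Hypothetical predicted risks from a model and from a clinician.
model_probs = [0.8, 0.3, 0.6, 0.4, 0.2, 0.7, 0.5, 0.4]
clinician_probs = [0.9, 0.5, 0.5, 0.6, 0.1, 0.9, 0.7, 0.8]

for label, probs in [("model", model_probs), ("clinician", clinician_probs)]:
    print(label,
          "AUC:", round(auc(probs, outcomes), 3),
          "calibration-in-the-large:",
          round(calibration_in_the_large(probs, outcomes), 3))
```

A predictor can rank patients well (high AUC) while still systematically over- or underestimating absolute risk, which is why the two measures are reported separately.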