1 Hunter Holmes McGuire VA Medical Center, Richmond, VA.
2 University of Michigan, Ann Arbor, MI.
J Clin Oncol. 2019 May 10;37(14):1209-1216. doi: 10.1200/JCO.18.01074. Epub 2019 Mar 21.
Comparative efficacy research that relies on population registries can be subject to significant bias. Objective data are lacking on which factors, if any, sufficiently reduce this bias and yield accurate results.
MEDLINE was searched from January 2000 to October 2016 for observational studies that compared two treatment regimens for any cancer diagnosis using SEER, SEER-Medicare, or the National Cancer Database. Reporting quality and statistical methods were assessed using components of the STROBE criteria. Randomized trials comparing the same treatment regimens were then identified. The primary outcome was the correlation between the survival hazard ratio (HR) estimates reported by the observational studies and those reported by the randomized trials. Secondary outcomes included agreement between matched pairs and predictors of agreement.
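The primary outcome can be illustrated with a minimal sketch of Lin's concordance correlation coefficient applied to paired HR estimates. This is not the authors' code. The function name, the choice to work on the log-HR scale, and the sample values are all assumptions made for illustration.

```python
import math


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two paired samples.

    Uses population (divide-by-n) variances and covariance.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("undefined when both samples are the same constant")
    return 2 * cov_xy / denom


# Made-up hazard ratios for illustration only (not data from the study).
observational_hr = [0.70, 0.85, 1.10, 0.60]
randomized_hr = [0.95, 0.80, 1.00, 1.05]

# Assumption: HRs are compared on the log scale, where they are symmetric around 0.
ccc = concordance_correlation(
    [math.log(h) for h in observational_hr],
    [math.log(h) for h in randomized_hr],
)
print(round(ccc, 3))
```

Unlike Pearson's correlation, this coefficient penalizes a systematic shift between the two sets of estimates, so a value of 1 requires the paired estimates to be identical, not merely linearly related.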
Of 3,657 studies reviewed, 350 treatment comparisons met eligibility criteria and were matched to 121 randomized trials. There was no significant correlation between the HR estimates reported by observational studies and randomized trials (concordance correlation coefficient, 0.083; 95% CI, -0.068 to 0.230). Forty percent of matched studies were in agreement regarding treatment effects (κ, 0.037; 95% CI, -0.027 to 0.1), and 62% of the observational study HRs fell within the 95% CIs of the randomized trials. None of the following predicted agreement: cancer type; data source; reporting quality; adjustment for age, stage, or comorbidities; use of propensity weighting, instrumental variable analysis, or sensitivity analysis; or a well-matched study population.
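The two agreement measures reported here can be sketched as follows. This is a hedged illustration, not the study's analysis. The three-way labeling of treatment effects ("favors_a", "favors_b", "no_difference"), the function names, and all sample values are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two paired label lists beyond chance."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be paired and non-empty")
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    # Chance agreement: product of marginal proportions, summed over labels.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both lists use one identical label")
    return (observed - expected) / (1 - expected)


def fraction_within_ci(estimates, intervals):
    """Share of point estimates that fall inside their paired (low, high) interval."""
    hits = sum(low <= est <= high for est, (low, high) in zip(estimates, intervals))
    return hits / len(estimates)


# Made-up labels and values for illustration only (not data from the study).
observational = ["favors_a", "favors_a", "no_difference", "favors_b"]
randomized = ["no_difference", "favors_a", "no_difference", "no_difference"]
print(cohen_kappa(observational, randomized))

obs_hr = [0.70, 0.90, 1.20]
trial_ci = [(0.60, 0.80), (1.00, 1.30), (1.00, 1.50)]
print(fraction_within_ci(obs_hr, trial_ci))
```

A kappa near 0, as reported above, means the observed agreement is about what the marginal label frequencies alone would produce, even when the raw percentage of agreeing pairs looks substantial.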
We were unable to identify any modifiable factor present in population-based observational studies that improved agreement with randomized trials. There was no agreement beyond what is expected by chance, regardless of reporting quality or statistical rigor of the observational study. Future work is needed to identify reliable methods for conducting population-based comparative efficacy research.