Chappell Mary, Watkins Deborah, Sanderson Alice, di Ruffano Lavinia Ferrante, Miller Paul, Fewster Hariet, Fitzgerald Anita, Edwards Mary, McCool Rachael
York Health Economics Consortium, Innovation Way, University of York, York, UK.
Cochrane Evid Synth Methods. 2025 Jan 16;3(1):e70016. doi: 10.1002/cesm.70016. eCollection 2025 Jan.
Interventional single-arm trials (SATs) are increasingly being used as evidence, despite a lack of agreement on their validity and where they should sit in the hierarchy of evidence. We conducted a meta-epidemiological study to investigate whether there are systematic differences in outcomes and levels of between-study heterogeneity for SATs compared with their observational counterpart, single-arm cohort studies.
We identified systematic reviews (SRs) of pharmacological interventions, published in 2023, that included both interventional and observational single-arm studies. For each SR, we conducted subgroup meta-analyses of dichotomous outcomes for the included SATs and single-arm cohort studies to assess effect sizes, levels of heterogeneity, and between-group differences. In a sensitivity analysis, clinically heterogeneous primary studies were removed and the analyses were re-run.
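The subgroup comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the abstract does not state the pooling model, so the sketch assumes simple inverse-variance (fixed-effect) weights on raw proportions, and the function name `pool_single_arm` and all study counts are hypothetical.

```python
import math

def pool_single_arm(events, totals):
    """Pool single-arm proportions with inverse-variance weights.

    Returns (pooled proportion, standard error, I^2 in percent).
    Assumes every study has 0 < events < total, so variances are non-zero.
    """
    props = [e / n for e, n in zip(events, totals)]
    # Binomial variance of each study's proportion
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]
    weights = [1 / v for v in variances]
    pooled = sum(w * p for w, p in zip(weights, props)) / sum(weights)
    # Cochran's Q and the I^2 statistic derived from it
    q = sum(w * (p - pooled) ** 2 for w, p in zip(weights, props))
    df = len(props) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    se = math.sqrt(1 / sum(weights))
    return pooled, se, i2

# Hypothetical responder counts for one SR (not data from the study)
sat_pooled, sat_se, sat_i2 = pool_single_arm([10, 20], [100, 100])
cohort_pooled, cohort_se, cohort_i2 = pool_single_arm([30, 45, 50], [100, 150, 200])

# Between-group risk difference (SATs minus cohorts) with a 95% CI
rd = sat_pooled - cohort_pooled
rd_se = math.sqrt(sat_se ** 2 + cohort_se ** 2)
print(f"RD = {rd:.3f} (95% CI {rd - 1.96 * rd_se:.3f} to {rd + 1.96 * rd_se:.3f})")
print(f"I^2: SATs {sat_i2:.1f}%, cohorts {cohort_i2:.1f}%")
```

A random-effects model or a transformed scale (for example logit or double arcsine) would usually be preferred for proportions; the fixed-effect version is used here only to keep the arithmetic easy to follow.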
66 SRs contained single-arm studies, of which 13 reported meta-analyses of dichotomous efficacy outcomes. There was no overall risk difference for SATs compared with single-arm cohort studies (risk difference: -0.020, 95% CI: -0.092 to 0.052, p = 0.59). In the sensitivity analysis, there was a tendency toward a higher effect for single-arm cohort studies, but no significant difference (risk difference: -0.071, 95% CI: -0.161 to 0.019, p = 0.12). There were high levels of between-study heterogeneity within both SATs (I², median; range: 54.8; 11.3-91.0) and single-arm cohorts (I², median; range: 77.2; 0-94.7), and heterogeneity remained high in the sensitivity analysis.
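As a consistency check on the figures above, the reported significance values can be approximately recovered from each risk difference and its 95% CI. The sketch below assumes a symmetric, normal-approximation (Wald) interval, which the abstract does not confirm; `p_from_ci` is a hypothetical helper name.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Two-sided p-value implied by an estimate and its symmetric 95% CI."""
    # Back out the standard error from the CI width
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided tail area of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Main analysis: risk difference -0.020 (95% CI -0.092 to 0.052)
print(round(p_from_ci(-0.020, -0.092, 0.052), 2))  # 0.59
# Sensitivity analysis: risk difference -0.071 (95% CI -0.161 to 0.019)
print(round(p_from_ci(-0.071, -0.161, 0.019), 2))  # 0.12
```

Both values match the reported 0.59 and 0.12 to two decimal places, which is what one would expect if the intervals were computed on the risk-difference scale with a normal approximation.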
There do not appear to be systematic differences in outcome between SATs and single-arm cohort studies, but further research is recommended to confirm this finding. Levels of heterogeneity are high within both designs, even after attempts to reduce clinical heterogeneity. Because clinical heterogeneity had potentially been removed, remaining statistical heterogeneity may have been due to bias related to study conduct. Future work should utilize larger samples and additional methods to further clarify the relative validity of single-arm designs.