Wason J M S, Dentamaro A, Eisen T G
MRC Biostatistics Unit, Cambridge, United Kingdom.
Department of Brain and Behavioral Sciences, University of Pavia, Italy.
Eur J Cancer. 2015 May;51(8):984-92. doi: 10.1016/j.ejca.2015.03.002. Epub 2015 Mar 31.
The high failure rate in phase III oncology trials is partly because the signal obtained from phase II trials is often weak. Several papers have considered the appropriateness of various phase II end-points for individual trials, but there has not been a systematic comparison using simulated data to determine which end-point should be used in which situation.
In this paper we carry out simulation studies to compare the power of several Response Evaluation Criteria in Solid Tumours (RECIST) response-based end-points for one-arm and two-arm trials, together with progression-free survival (PFS) and a direct test of tumour shrinkage for two-arm trials. We consider six scenarios: (1) short-term cytotoxic therapy; (2) continuous cytotoxic therapy; (3 and 4) cytostatic therapy; (5 and 6) a delayed tumour-shrinkage effect (seen in some immunotherapies). We also consider measurement error in the assessment of tumour size.
Measurement error affects the type-I error rate and power of single-arm trials, and the power of two-arm trials. No single end-point performed well in all scenarios. Best observed response rate, PFS and the direct test of tumour shrinkage each performed best in a number of scenarios. PFS performed very poorly when the effect of the treatment was short-lived. In scenario 6, where the delay in effect was long, no end-point performed well.
A clinician setting up a phase II trial should consider the drug's likely mechanism of action and choose an end-point that provides high power for that scenario. Testing the difference in tumour shrinkage is often powerful. Alternative end-points are required for therapies with a long-delayed effect.
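To illustrate the kind of comparison described above, the following is a minimal Monte Carlo sketch of a two-arm trial, not the authors' simulation model. It compares a dichotomised response end-point (at least 30% shrinkage, the RECIST partial-response threshold) with a direct test of mean tumour shrinkage, with and without measurement error. All numerical settings (arm size, mean changes, standard deviations, the normal model for percentage change) and all function names are assumptions chosen purely for illustration.

```python
import math
import random
from statistics import NormalDist, mean, variance

Z_CRIT = NormalDist().inv_cdf(0.95)  # one-sided 5% significance level


def simulate_arm(rng, n, mean_change, sd_true, sd_error):
    """Observed % change in tumour size = true change + measurement error.

    Assumed model: both components are normally distributed.
    """
    return [rng.gauss(mean_change, sd_true) + rng.gauss(0.0, sd_error)
            for _ in range(n)]


def response_test(control, treated, threshold=-30.0):
    """One-sided two-proportion z-test on response (>= 30% shrinkage)."""
    n_c, n_t = len(control), len(treated)
    r_c = sum(x <= threshold for x in control)
    r_t = sum(x <= threshold for x in treated)
    pooled = (r_c + r_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if se == 0:
        return False  # no responders (or all responders) in both arms
    return (r_t / n_t - r_c / n_c) / se > Z_CRIT


def shrinkage_test(control, treated):
    """One-sided z-test on mean % change (more negative means more shrinkage)."""
    se = math.sqrt(variance(control) / len(control)
                   + variance(treated) / len(treated))
    return (mean(control) - mean(treated)) / se > Z_CRIT


def estimate_power(effect, sd_error, n=50, n_sims=2000, seed=1):
    """Return (rejection rate of response test, rejection rate of shrinkage test).

    `effect` is the extra mean shrinkage (in percentage points) in the
    treated arm; effect=0 gives the type-I error rate instead of power.
    """
    rng = random.Random(seed)
    hits_resp = hits_shrink = 0
    for _ in range(n_sims):
        control = simulate_arm(rng, n, -10.0, 25.0, sd_error)
        treated = simulate_arm(rng, n, -10.0 - effect, 25.0, sd_error)
        hits_resp += response_test(control, treated)
        hits_shrink += shrinkage_test(control, treated)
    return hits_resp / n_sims, hits_shrink / n_sims


if __name__ == "__main__":
    for sd_error in (0.0, 15.0):
        resp, shrink = estimate_power(effect=15.0, sd_error=sd_error)
        print(f"sd_error={sd_error}: response={resp:.2f}, shrinkage={shrink:.2f}")
```

Under these assumed settings the continuous shrinkage test rejects more often than the dichotomised response test, and adding measurement error lowers the power of both. The sketch models a single post-baseline measurement only, so it cannot represent PFS or the short-lived and delayed-effect scenarios.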