Schatzkin A
Nutritional Epidemiology Branch, National Cancer Institute, Bethesda, Maryland, USA.
Hematol Oncol Clin North Am. 2000 Aug;14(4):887-905. doi: 10.1016/s0889-8588(05)70317-8.
Because studies with surrogate cancer endpoints can be smaller, faster, and substantially less expensive than those with frank cancer outcomes, the use of surrogate endpoints is undeniably attractive. This attractiveness is likely to grow in coming years as the rapidly advancing discoveries in cell and molecular biology generate new therapies requiring testing and new markers that could plausibly serve as surrogates for cancer. Surrogate endpoint studies can certainly be suggestive. They continue to play a legitimate role in phase II studies, and they may give the right answers about intervention effects on or exposure associations with cancer. The problem is the uncertainty attached to most potential surrogates. Except for those few surrogates that are both necessary for and developmentally relatively close to cancer, the existence of plausible alternative pathways makes inferences about cancer from many surrogates problematic. Merely being on the causal pathway to cancer does not in itself constitute surrogate validity. It is the totality of causal connections that is critical. There is, unfortunately, a fairly extensive history of quite plausible surrogate markers giving the wrong answer about various chronic disease therapies. There is no reason to believe that cancer surrogacy is immune to such inferential difficulties. This article is, in part, an invitation, even a plea, for researchers to carry out the investigations necessary to evaluate potential surrogates, particularly surrogate-cancer studies and intervention or exposure-surrogate-cancer mediation analyses. Such studies are needed to generalize from surrogate endpoint findings to cancer. There is, however, an implicit and perhaps unavoidable irony here: the large, long, expensive studies required to evaluate potential surrogates fully are precisely the studies that surrogates were designed to replace. 
The exposure dependence alluded to earlier complicates matters further: establishing validity for a given surrogate for one intervention or exposure vis-à-vis cancer does not necessarily translate into validity for another intervention or exposure. One can enhance the inferential strength of surrogacy by using further "downstream" markers. Results of trials with CIN3 as an endpoint are arguably more persuasive than those from intervention studies with HPV infection endpoints. Similarly, one could consider only the advanced adenoma (≥ 1 cm, villous elements, or high-grade dysplasia) as the primary endpoint in adenoma recurrence trials. The inferential gain, however, comes with substantial costs: studies with CIN3 endpoints must be much larger than those with HPV infection endpoints; adenoma recurrence trials with sufficient rates of recurrence of advanced adenomas must be five or six times larger than trials with any recurrent adenomas as endpoints. A law emerges here: in using surrogate endpoints, inferential certainty is directly associated with study cost. In other words, one gets what one pays for. The problems inherent in using surrogate endpoints need not be regarded as a cause for pessimism in cancer research. If anything, the limitations of surrogacy are reminders of the complexity of cancer causation and affirm the continued importance of large clinical trials and observational epidemiologic studies with explicit cancer endpoints.
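The cost of moving to a rarer, further-downstream endpoint can be illustrated with a standard two-proportion sample-size approximation. The sketch below is not taken from the article: the event rates (35% for any recurrent adenoma, 8% for advanced adenoma), the 25% relative risk reduction, the 5% two-sided alpha, and the 80% power are all hypothetical values chosen only to show the direction and rough size of the effect, and `n_per_arm` is an illustrative helper name.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control, relative_reduction, alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-arm trial comparing
    two proportions (normal approximation, two-sided test)."""
    p_treat = p_control * (1 - relative_reduction)
    p_bar = (p_control + p_treat) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treat * (1 - p_treat))
    ) ** 2
    return ceil(numerator / (p_control - p_treat) ** 2)


# Hypothetical control-arm event rates; NOT values reported in the article.
n_any = n_per_arm(p_control=0.35, relative_reduction=0.25)       # any recurrent adenoma
n_advanced = n_per_arm(p_control=0.08, relative_reduction=0.25)  # advanced adenoma only
ratio = n_advanced / n_any

print(f"any adenoma endpoint:      {n_any} per arm")
print(f"advanced adenoma endpoint: {n_advanced} per arm")
print(f"size ratio:                {ratio:.1f}x")
```

Under these assumed rates, the trial with the rarer endpoint needs several times as many participants for the same relative effect, which is the trade-off between inferential certainty and study cost described above.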