Estimating significance level and power comparisons for testing multiple endpoints in clinical trials.

Author information

Gong J, Pinheiro J C, DeMets D L

Affiliation

Department of Biostatistics, University of Wisconsin-Madison, Madison, WI, USA.

Publication information

Control Clin Trials. 2000 Aug;21(4):313-29. doi: 10.1016/s0197-2456(00)00049-0.

DOI:10.1016/s0197-2456(00)00049-0

PMID:10913807

Abstract

Clinical trials generally include several outcome measures of interest for assessing treatment efficacy and harm. Traditionally, a single measure, the primary outcome, is selected and used as the basis for the design, including sample size and power. Secondary outcomes are then generally ordered with respect to their clinical relevance and importance. While this has become the traditional paradigm, recent trials have suggested the need for additional approaches. In such trials, two outcomes are viewed as key, either one being sufficient for proof of efficacy, but with an ordering of preference. The basic question, in such cases, is how to control the overall significance level for the trial. We describe and compare two methods for testing primary and secondary endpoints, accounting for their hierarchical nature, that is, the ordering of preference. Both methods are sequential, in the sense that the secondary endpoint is tested only when the primary outcome fails to reach significance. The first method uses a global test for the combination of the primary and secondary endpoints, while the second uses a partial Bonferroni correction. Simulation results indicate that the Bonferroni adjustment method performs as well as the global test method in most cases, and even better in some cases.
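
The sequential idea in the abstract (test the secondary endpoint only if the primary endpoint is not significant) can be illustrated with a small Monte Carlo sketch of the overall type I error. This is not the authors' procedure. It assumes two standard-normal test statistics with correlation `rho` under the global null, two-sided tests, and a plain Bonferroni-style split of the overall level into `alpha_primary` and `alpha_secondary`, which may differ from the partial Bonferroni correction studied in the paper. The function name, the 0.04/0.01 split, and the correlation values are hypothetical choices made for illustration only.

```python
import math
import random
from statistics import NormalDist


def simulate_fwer(alpha_primary, alpha_secondary, rho, n_sims=100_000, seed=0):
    """Estimate the overall type I error of a sequential two-endpoint test.

    Assumption (not from the paper): under the global null the two test
    statistics are bivariate standard normal with correlation rho.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    rng = random.Random(seed)
    norm = NormalDist()
    # Two-sided critical values for each endpoint.
    crit_primary = norm.inv_cdf(1 - alpha_primary / 2)
    crit_secondary = norm.inv_cdf(1 - alpha_secondary / 2)
    scale = math.sqrt(1 - rho ** 2)
    rejections = 0
    for _ in range(n_sims):
        z_primary = rng.gauss(0, 1)
        # Build a correlated secondary statistic from an independent draw.
        z_secondary = rho * z_primary + scale * rng.gauss(0, 1)
        if abs(z_primary) > crit_primary:
            rejections += 1  # primary endpoint significant: stop here
        elif abs(z_secondary) > crit_secondary:
            rejections += 1  # secondary tested only after primary fails
    return rejections / n_sims


if __name__ == "__main__":
    # Hypothetical split of an overall 0.05 level: 0.04 primary, 0.01 secondary.
    for rho in (0.0, 0.5, 0.9):
        estimate = simulate_fwer(0.04, 0.01, rho)
        print(f"rho={rho:.1f}  estimated overall type I error={estimate:.4f}")
```

Because the two per-endpoint levels sum to 0.05, the Bonferroni inequality bounds the true overall error at 0.05 for any correlation. The simulated estimates should therefore sit at or below that level, up to Monte Carlo noise.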
