Huang Ying
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA
Clin Trials. 2015 Aug;12(4):348-56. doi: 10.1177/1740774515580126. Epub 2015 May 6.
BACKGROUND/AIMS: Biomarkers associated with treatment-effect heterogeneity can be used to make treatment recommendations that optimize individual clinical outcomes. To accomplish this, statistical methods are needed to generate marker-based treatment-selection rules that most effectively reduce the population burden due to disease and treatment. Compared with the standard approach of deriving treatment-selection rules from risk modeling, a more robust approach is to directly minimize an unbiased estimate of the total disease and treatment burden over a pre-specified class of rules. This amounts to minimizing a weighted sum of 0-1 loss functions, which is computationally challenging because the 0-1 loss is nonsmooth. Huang and Fong, among others, proposed a method that uses the ramp loss to approximate the 0-1 loss and solves the minimization problem through repeated constrained optimizations. The algorithm was shown to perform comparably to or better than competing estimators in various settings. Our aim in this article is to extend the algorithm to allow for variable selection in the presence of a large number of candidate markers.
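To make the approximation concrete, the following minimal Python sketch compares a weighted sum of 0-1 losses with a weighted sum of ramp losses. It is an illustration only, not the estimator of Huang and Fong: the ramp form `min(1, max(0, 1 - margin/scale))` is one common definition and is assumed here, and the function names, the `scale` parameter, and the toy weights, labels, and scores are all hypothetical.

```python
def zero_one_loss(margin):
    """0-1 loss: 1 when the signed margin is non-positive, else 0."""
    return 1.0 if margin <= 0 else 0.0


def ramp_loss(margin, scale=1.0):
    """Ramp loss (assumed form): a hinge loss truncated at 1.

    It is continuous and piecewise linear, bounds the 0-1 loss from
    above, and approaches it pointwise as `scale` shrinks toward 0.
    """
    return min(1.0, max(0.0, 1.0 - margin / scale))


def weighted_loss(weights, labels, scores, loss):
    """Weighted sum of `loss` evaluated at the margins label * score."""
    return sum(w * loss(y * f) for w, y, f in zip(weights, labels, scores))


# Toy inputs (hypothetical): labels in {-1, +1}, scores from a linear rule.
weights = [1.0, 2.0, 0.5]
labels = [1, -1, 1]
scores = [0.5, 0.25, -2.0]

total_zero_one = weighted_loss(weights, labels, scores, zero_one_loss)
total_ramp = weighted_loss(weights, labels, scores, ramp_loss)
```

On these toy inputs the ramp total is at least as large as the 0-1 total, as expected for a surrogate that bounds the 0-1 loss from above; shrinking `scale` tightens the gap at the cost of a steeper, harder-to-optimize surface.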
METHODS: We develop an alternative method for deriving marker combinations that minimize the weighted sum of ramp losses of Huang and Fong, based on data from randomized trials. The new algorithm estimates treatment-selection rules by repeatedly minimizing a smooth, differentiable objective function. By adding an L1 penalty, we extend the method to allow for feature selection and develop a coordinate descent algorithm to build the treatment-selection rule.
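The mechanics of combining an L1 penalty with coordinate descent can be sketched as follows. This is not the proposed algorithm: the abstract does not specify its smooth objective, so a squared-error objective (the lasso) is swapped in purely for illustration, and the names `soft_threshold`, `lasso_coordinate_descent`, `lam`, and `n_sweeps` are hypothetical. The sketch minimizes `(1/(2n)) * sum((y - X b)^2) + lam * sum(|b_j|)` by updating one coefficient at a time.

```python
def soft_threshold(z, lam):
    """Shrink z toward 0 by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for L1-penalized least squares.

    X is a list of rows, y a list of responses. Each update solves the
    one-dimensional penalized problem for coefficient j exactly while
    holding the other coefficients fixed.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # correlation of column j with the partial residual
            norm = 0.0  # squared norm of column j
            for i in range(n):
                fit_without_j = sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                norm += X[i][j] ** 2
            if norm > 0:
                beta[j] = soft_threshold(rho / n, lam) / (norm / n)
            else:
                beta[j] = 0.0
    return beta


# Toy data (hypothetical): the second column is only weakly related to y.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=0.2)
```

The soft-thresholding step is what produces variable selection: a coefficient whose signal falls below `lam` is set exactly to zero, which here removes the second marker while shrinking the first.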
RESULTS: Through extensive simulation studies, we compared the performance of the proposed estimator with that of four existing approaches: (1) a logistic regression risk modeling approach, and three other "direct optimizing" approaches, namely (2) the estimator of Huang and Fong, (3) the weighted support vector machine, and (4) weighted logistic regression. The proposed estimator performs comparably to the estimator of Huang and Fong, and comparably to or better than the other estimators. Allowing for variable selection with the proposed estimator in the presence of a large number of markers further improves treatment-selection performance. The proposed estimator is also better at selecting variables relevant to treatment selection than L1-penalized logistic regression and weighted logistic regression. We illustrate the application of the proposed methods using host-genetics data from an HIV vaccine trial.
CONCLUSION: The proposed estimator is appealing considering its effectiveness and conceptual simplicity. It has significant potential to contribute to the selection and combination of biomarkers for treatment selection in clinical practice.