Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA.
Department of Biostatistics, Columbia University, New York City, NY, USA.
Stat Med. 2018 Nov 20;37(26):3776-3788. doi: 10.1002/sim.7844. Epub 2018 Jun 5.
Dynamic treatment regimens (DTRs) are sequences of treatment decisions tailored to a patient's evolving features and intermediate outcomes at each treatment stage. Patient heterogeneity and the complexity and chronicity of many diseases call for learning optimal DTRs that best tailor treatment to each individual's time-varying characteristics (eg, intermediate response over time). In this paper, we propose a robust and efficient approach, referred to as Augmented Outcome-weighted Learning (AOL), for identifying optimal DTRs from sequential multiple assignment randomized trials. We improve previously proposed outcome-weighted learning to allow for negative weights. Furthermore, to reduce the variability of the weights for numerical stability and to improve estimation accuracy, AOL robustly augments the weights using pseudo-outcomes predicted from regression models for Q-functions. We show that AOL still yields Fisher-consistent DTRs even when the regression models are misspecified and that an appropriate choice of the augmentation guarantees smaller stochastic errors in value function estimation for AOL than for the previous outcome-weighted learning. Finally, we establish convergence rates for AOL. The comparative advantage of AOL over existing methods is demonstrated through extensive simulation studies and an application to a sequential multiple assignment randomized trial for major depressive disorder.
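To make the abstract's key ideas concrete, the sketch below works through a single decision stage with a randomized binary treatment: a working regression model supplies a predicted pseudo-outcome that is subtracted from the observed outcome to reduce weight variability, and a negative residual weight is handled by flipping the treatment label and taking the absolute value, so the problem remains a weighted classification. This is a minimal numpy illustration under simulated data, not the paper's full multi-stage estimator; the data-generating model, the logistic surrogate loss, and all variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
A = rng.choice([-1, 1], size=n)              # randomized treatment, P(A=1) = 0.5
# Outcome with a treatment-covariate interaction; the optimal single-stage
# rule in this toy model is sign(X[:, 0]).
R = 1 + X[:, 1] + A * X[:, 0] + rng.normal(scale=0.5, size=n)

# Augmentation step: fit a working regression model for the outcome
# (ordinary least squares on X only; possibly misspecified on purpose).
Z = np.column_stack([np.ones(n), X])
beta = np.linalg.lstsq(Z, R, rcond=None)[0]
resid = R - Z @ beta                         # residual acts as the new weight

# Negative weights: flip the treatment label and keep the magnitude,
# so every classification weight is nonnegative.
w = np.abs(resid)
y = A * np.sign(resid)

# Weighted linear classification via a logistic surrogate loss,
# minimized by plain gradient descent.
theta = np.zeros(Z.shape[1])
for _ in range(500):
    margin = np.clip(y * (Z @ theta), -30, 30)   # avoid exp overflow
    p = 1 / (1 + np.exp(-margin))
    grad = -Z.T @ (w * y * (1 - p)) / n
    theta -= 0.5 * grad

rule = np.sign(Z @ theta)                    # estimated treatment rule
accuracy = np.mean(rule == np.sign(X[:, 0]))  # agreement with the true rule
```

Because the interaction signal survives in the residual even though the working model omits treatment entirely, the learned rule recovers sign(X[:, 0]) on most of the sample, which mirrors the abstract's claim that Fisher consistency does not require a correctly specified regression model.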