Step-adjusted tree-based reinforcement learning for evaluating nested dynamic treatment regimes using test-and-treat observational data.

Affiliations

Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.

The James Buchanan Brady Urological Institute and Department of Urology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA.

Publication information

Stat Med. 2021 Nov 30;40(27):6164-6177. doi: 10.1002/sim.9177. Epub 2021 Sep 7.

DOI:10.1002/sim.9177

PMID:34490942

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8595655/

Abstract

Dynamic treatment regimes (DTRs) include a sequence of treatment decision rules, in which treatment is adapted over time in response to the changes in an individual's disease progression and health care history. In medical practice, nested test-and-treat strategies are common to improve cost-effectiveness. For example, for patients at risk of prostate cancer, only patients who have high prostate-specific antigen (PSA) need a biopsy, which is costly and invasive, to confirm the diagnosis and help determine the treatment if needed. A decision about treatment happens after the biopsy, and is thus nested within the decision of whether to do the test. However, current existing statistical methods are not able to accommodate such a naturally embedded property of the treatment decision within the test decision. Therefore, we developed a new statistical learning method, step-adjusted tree-based reinforcement learning, to evaluate DTRs within such a nested multistage dynamic decision framework using observational data. At each step within each stage, we combined the robust semiparametric estimation via augmented inverse probability weighting with a tree-based reinforcement learning method to deal with the counterfactual optimization. The simulation studies demonstrated robust performance of the proposed methods under different scenarios. We further applied our method to evaluate the necessity of prostate biopsy and identify the optimal test-and-treat regimes for prostate cancer patients using data from the Johns Hopkins University prostate cancer active surveillance dataset.
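
The abstract names augmented inverse probability weighting (AIPW) as the robust estimation step. The sketch below illustrates only that building block for a single binary decision on simulated data. It is not the authors' step-adjusted tree-based procedure. The function name `aipw_mean`, the data-generating model, and the deliberately misspecified nuisance models are all assumptions made for illustration.

```python
import numpy as np


def aipw_mean(y, a, target, propensity, outcome_pred):
    """AIPW estimate of the counterfactual mean E[Y(target)].

    propensity   : estimated P(A = target | X) for each subject
    outcome_pred : estimated E[Y | X, A = target] for each subject
    (Hypothetical helper; both nuisance estimates are supplied by the caller.)
    """
    y = np.asarray(y, dtype=float)
    indicator = (np.asarray(a) == target).astype(float)
    propensity = np.asarray(propensity, dtype=float)
    outcome_pred = np.asarray(outcome_pred, dtype=float)
    # Outcome-model prediction plus an inverse-probability-weighted residual.
    return float(np.mean(outcome_pred + indicator / propensity * (y - outcome_pred)))


# Simulated observational data (assumed model): true E[Y(1)] = 3.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
p_treat = 1.0 / (1.0 + np.exp(-x))          # true P(A = 1 | X)
a = rng.binomial(1, p_treat)
y = 1.0 + 2.0 * a + x + rng.normal(size=n)

# Correct propensity, deliberately wrong outcome model (all zeros).
est_wrong_outcome = aipw_mean(y, a, 1, p_treat, np.zeros(n))
# Correct outcome model, deliberately wrong propensity (constant 0.5).
est_wrong_propensity = aipw_mean(y, a, 1, np.full(n, 0.5), 3.0 + x)
print(est_wrong_outcome, est_wrong_propensity)
```

Under this assumed setup both estimates should land close to 3, which reflects the double robustness that motivates AIPW: the estimator remains consistent when either the propensity model or the outcome model is correct.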
