Lu Min, Sadiq Saad, Feaster Daniel J, Ishwaran Hemant
Division of Biostatistics, University of Miami, Coral Gables, FL.
Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL.
J Comput Graph Stat. 2018;27(1):209-219. doi: 10.1080/10618600.2017.1356325. Epub 2018 Feb 1.
Estimating individual treatment effects from observational data is complicated by confounding and selection bias. A useful inferential framework for addressing this is the counterfactual (potential outcomes) model, which takes the hypothetical stance of asking what an individual's outcome would have been under each treatment. Using random forests (RF) within the counterfactual framework, we estimate individual treatment effects by directly modeling the response. We find that accurate estimation of individual treatment effects is possible even in complex heterogeneous settings, but that the type of RF approach plays an important role in accuracy. Methods designed to be adaptive to confounding, when used in parallel with out-of-sample estimation, do best. One method found to be especially promising is counterfactual synthetic forests. We illustrate this new methodology by applying it to a large comparative effectiveness trial, Project Aware, to explore the role drug use plays in sexual risk. The analysis reveals important connections between risky behavior, drug usage, and sexual risk.
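The idea of estimating individual treatment effects by directly modeling the response can be illustrated with a minimal sketch. It assumes scikit-learn's `RandomForestRegressor`, fits one forest per treatment arm, and uses out-of-bag predictions for each individual's observed arm so that both potential-outcome estimates are out-of-sample. This is not the authors' implementation and it is not the counterfactual synthetic forests method; the function name `counterfactual_ite`, the tuning values, and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def counterfactual_ite(X, t, y, n_trees=200, seed=0):
    """Sketch: per-arm forests, ITE = predicted outcome(1) - predicted outcome(0)."""
    X = np.asarray(X, dtype=float)
    t = np.asarray(t, dtype=int)
    y = np.asarray(y, dtype=float)
    preds = np.empty((len(y), 2))
    for arm in (0, 1):
        in_arm = t == arm
        rf = RandomForestRegressor(
            n_estimators=n_trees,
            min_samples_leaf=5,
            bootstrap=True,
            oob_score=True,          # needed to expose oob_prediction_
            random_state=seed + arm,
        )
        rf.fit(X[in_arm], y[in_arm])
        # Individuals observed in this arm: out-of-bag prediction (out-of-sample).
        preds[in_arm, arm] = rf.oob_prediction_
        # Individuals in the other arm were never seen by this forest.
        preds[~in_arm, arm] = rf.predict(X[~in_arm])
    return preds[:, 1] - preds[:, 0]


# Simulated (hypothetical) observational data with confounded assignment.
rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(-1.0, 1.0, size=(n, 5))
propensity = 1.0 / (1.0 + np.exp(-2.0 * X[:, 0]))   # assignment depends on X0
t = rng.binomial(1, propensity)
tau_true = 1.0 + 2.0 * X[:, 1]                       # heterogeneous effect
y = X[:, 0] + t * tau_true + rng.normal(0.0, 0.5, size=n)

tau_hat = counterfactual_ite(X, t, y)
print("correlation with true effect:", np.corrcoef(tau_hat, tau_true)[0, 1])
```

In this sketch the same covariate drives both treatment assignment and the outcome, so a naive difference in arm means would be biased, whereas the per-arm forests condition on the covariates before differencing.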