Comparing methods for estimation of heterogeneous treatment effects using observational data from health care databases.

Affiliations

Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.

Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, USA.

Publication information

Stat Med. 2018 Oct 15;37(23):3309-3324. doi: 10.1002/sim.7820. Epub 2018 Jun 3.

DOI:10.1002/sim.7820

PMID:29862536

Abstract

There is growing interest in using routinely collected data from health care databases to study the safety and effectiveness of therapies in "real-world" conditions, as it can provide complementary evidence to that of randomized controlled trials. Causal inference from health care databases is challenging because the data are typically noisy, high dimensional, and most importantly, observational. It requires methods that can estimate heterogeneous treatment effects while controlling for confounding in high dimensions. Bayesian additive regression trees, causal forests, causal boosting, and causal multivariate adaptive regression splines are off-the-shelf methods that have shown good performance for estimation of heterogeneous treatment effects in observational studies of continuous outcomes. However, it is not clear how these methods would perform in health care database studies where outcomes are often binary and rare and data structures are complex. In this study, we evaluate these methods in simulation studies that recapitulate key characteristics of comparative effectiveness studies. We focus on the conditional average effect of a binary treatment on a binary outcome using the conditional risk difference as an estimand. To emulate health care database studies, we propose a simulation design where real covariate and treatment assignment data are used and only outcomes are simulated based on nonparametric models of the real outcomes. We apply this design to 4 published observational studies that used records from 2 major health care databases in the United States. Our results suggest that Bayesian additive regression trees and causal boosting consistently provide low bias in conditional risk difference estimates in the context of health care database studies.
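The estimand named in the abstract, the conditional risk difference, is the difference in outcome risk between treated and untreated patients who share covariate values x: P(Y=1 | W=1, X=x) - P(Y=1 | W=0, X=x). The following is a minimal sketch of the general idea behind the simulation design (covariates and treatment held fixed, only outcomes simulated from a known risk model so that the true effect is available for comparison). It is not the authors' implementation: the single binary covariate, the risk function `true_risk`, the synthetic stand-in for real covariate and treatment data, and the stratified estimator are all assumptions made for illustration, and none of the four methods compared in the paper is used here.

```python
import random


def true_risk(x, w):
    """Hypothetical outcome model for a rare binary outcome (assumption)."""
    base = 0.02 + 0.03 * x    # baseline risk is higher when x == 1
    effect = 0.01 + 0.02 * x  # treatment adds more risk when x == 1
    return base + effect * w


def true_crd(x):
    """True conditional risk difference implied by the hypothetical model."""
    return true_risk(x, 1) - true_risk(x, 0)


def make_fixed_design(n, seed=0):
    """Synthetic stand-in for real covariate and treatment data (assumption).

    Treatment depends on x, so the design is confounded.
    """
    rng = random.Random(seed)
    design = []
    for _ in range(n):
        x = 1 if rng.random() < 0.4 else 0
        w = 1 if rng.random() < (0.7 if x == 1 else 0.3) else 0
        design.append((x, w))
    return design


def simulate_outcomes(design, seed=1):
    """Keep (x, w) fixed and draw only the binary outcomes."""
    rng = random.Random(seed)
    return [(x, w, 1 if rng.random() < true_risk(x, w) else 0)
            for x, w in design]


def estimate_crd(rows):
    """Stratified estimate: observed risk difference within each level of x."""
    counts = {}  # (x, w) -> [number of events, number of patients]
    for x, w, y in rows:
        cell = counts.setdefault((x, w), [0, 0])
        cell[0] += y
        cell[1] += 1
    estimates = {}
    for x in (0, 1):
        risk_treated = counts[(x, 1)][0] / counts[(x, 1)][1]
        risk_control = counts[(x, 0)][0] / counts[(x, 0)][1]
        estimates[x] = risk_treated - risk_control
    return estimates


if __name__ == "__main__":
    rows = simulate_outcomes(make_fixed_design(200_000))
    estimates = estimate_crd(rows)
    for x in (0, 1):
        bias = estimates[x] - true_crd(x)
        print(f"x={x}: estimate={estimates[x]:.4f}, "
              f"truth={true_crd(x):.4f}, bias={bias:.4f}")
```

Because the outcomes are generated from a known model, the bias of any estimator can be measured directly against `true_crd`. Stratification is only feasible here because there is one binary covariate; the high-dimensional setting described in the abstract is what motivates the flexible methods the study compares.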

