Department of Decision Sciences, HEC Montréal, Montréal, QC H3T 2A7, Canada.
Bioinformatics. 2020 Jan 15;36(2):629-636. doi: 10.1093/bioinformatics/btz602.
Personalized medicine often relies on accurate estimation of a treatment effect for specific subjects. This estimation can be based on the subject's baseline covariates, but additional complications arise when the response is a time-to-event outcome subject to censoring. In this paper, the treatment effect is measured as the difference between the mean survival time of a treated subject and the mean survival time of a control subject. We propose a new random forest method for estimating the individual treatment effect with survival data. The forest is formed by individual trees built with a splitting rule specifically designed to partition the data according to the individual treatment effect. For a new subject, the forest provides a set of similar subjects from the training dataset, which can then be used to compute an estimate of the individual treatment effect with any adequate method.
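To make the last step concrete, the following is a minimal sketch of one possible way to turn a set of similar subjects into a treatment effect estimate. It is not the authors' implementation. It assumes the set of similar subjects has already been obtained (the forest and its splitting rule are not reproduced here), that all similar subjects are weighted equally, and that the mean survival time is replaced by a restricted mean up to a horizon `tau`, computed as the area under a Kaplan-Meier curve in each arm. The function names `km_restricted_mean` and `ite_from_similar_subjects` and the parameter `tau` are hypothetical.

```python
import numpy as np


def km_restricted_mean(time, event, tau):
    """Area under the Kaplan-Meier curve on [0, tau] (restricted mean survival)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # Distinct observed event times up to the horizon tau (sorted by np.unique).
    event_times = np.unique(time[event & (time <= tau)])
    surv, prev_t, area = 1.0, 0.0, 0.0
    for t in event_times:
        # The survival estimate is flat between consecutive event times.
        area += surv * (t - prev_t)
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & event)
        surv *= 1.0 - deaths / at_risk
        prev_t = t
    # Last flat piece, from the final event time up to tau.
    return area + surv * (tau - prev_t)


def ite_from_similar_subjects(time, event, treated, tau):
    """Treated-minus-control difference in restricted mean survival.

    Assumption: `time`, `event`, `treated` describe only the similar
    subjects returned for one new subject, all weighted equally.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    treated = np.asarray(treated, dtype=bool)
    rmst_treated = km_restricted_mean(time[treated], event[treated], tau)
    rmst_control = km_restricted_mean(time[~treated], event[~treated], tau)
    return rmst_treated - rmst_control


# Toy data (made up for illustration): event = 1 means the event was observed.
time = [2.0, 4.0, 1.0, 3.0]
event = [1, 1, 1, 1]
treated = [1, 1, 0, 0]
estimate = ite_from_similar_subjects(time, event, treated, tau=10.0)
```

Restricting the mean to `tau` is a choice made for this sketch because the unrestricted mean is not identifiable when the largest observed times are censored; the abstract itself leaves the estimation method open ("any adequate method").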
The merits of the proposed method are investigated in a simulation study where it is compared to numerous competitors, including recent state-of-the-art methods. The results indicate that the proposed method performs well and stably in estimating individual treatment effects. Two application examples, one with colon cancer data and one with breast cancer data, show that the proposed method can detect a treatment effect in a sub-population even when the overall effect is small or nonexistent.
The authors are working on an R package implementing the proposed method and it will be available soon. In the meantime, the code can be obtained from the first author at sami.tabib@hec.ca.
Supplementary data are available at Bioinformatics online.