Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
Data Science Institute, Columbia University and Columbia University Medical Center, New York, NY, USA and Applied Physics and Applied Mathematics, Columbia University and Columbia University Medical Center, New York, NY, USA.
Biostatistics. 2022 Apr 13;23(2):643-665. doi: 10.1093/biostatistics/kxaa047.
Personalized cancer treatments based on the molecular profile of a patient's tumor are an emerging and exciting class of treatments in oncology. As genomic tumor profiling is becoming more common, targeted treatments for specific molecular alterations are gaining traction. To discover new potential therapeutics that may apply to broad classes of tumors matching some molecular pattern, experimentalists and pharmacologists rely on high-throughput, in vitro screens of many compounds against many different cell lines. We propose a hierarchical Bayesian model of how cancer cell lines respond to drugs in these experiments and develop a method for fitting the model to real-world high-throughput screening data. Through a case study, the model is shown to capture nontrivial associations between molecular features and drug response, such as the requirement that a cell line have both wild-type TP53 and overexpression of MDM2 to be sensitive to Nutlin-3(a). In quantitative benchmarks, the model outperforms a standard approach in biology, with $\approx 20\%$ lower predictive error on held-out data. When combined with a conditional randomization testing procedure, the model discovers markers of therapeutic response that recapitulate known biology and suggest new avenues for investigation. All code for the article is publicly available at https://github.com/tansey/deep-dose-response.