TANDEM: a two-stage approach to maximize interpretability of drug response models based on multiple molecular data types.

Authors

Aben Nanne, Vis Daniel J, Michaut Magali, Wessels Lodewyk F A

Affiliations

Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, Amsterdam 1066CX, The Netherlands; Faculty of EEMCS, Delft University of Technology, Delft 2628CD, The Netherlands.

Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, Amsterdam 1066CX, The Netherlands.

Publication information

Bioinformatics. 2016 Sep 1;32(17):i413-i420. doi: 10.1093/bioinformatics/btw449.

DOI:10.1093/bioinformatics/btw449

PMID:27587657

Abstract

MOTIVATION

Clinical response to anti-cancer drugs varies between patients. A large portion of this variation can be explained by differences in molecular features, such as mutation status, copy number alterations, methylation and gene expression profiles. We show that the classic approach for combining these molecular features (Elastic Net regression on all molecular features simultaneously) results in models that are almost exclusively based on gene expression. The gene expression features selected by the classic approach are difficult to interpret as they often represent poorly studied combinations of genes, activated by aberrations in upstream signaling pathways.
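For reference, the Elastic Net objective behind the "classic approach" can be written as below. This uses a common parametrization with an overall penalty strength and a mixing weight; the symbols ($y_i$ for the response of sample $i$, $x_i$ for its concatenated molecular features, $\lambda$, $\alpha$) are notation assumed here and do not appear in the abstract.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2
+\lambda\Bigl(\alpha\,\lVert\beta\rVert_1+\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr)
```

Because all data types share one coefficient vector and one penalty, the largest and most informative block of features (here, gene expression) can absorb most of the nonzero coefficients.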

RESULTS

To utilize all data types in a more balanced way, we developed TANDEM, a two-stage approach in which the first stage explains response using upstream features (mutations, copy number, methylation and cancer type) and the second stage explains the remainder using downstream features (gene expression). Applying TANDEM to 934 cell lines profiled across 265 drugs (GDSC1000), we show that the resulting models are more interpretable, while retaining the same predictive performance as the classic approach. Using the more balanced contributions per data type as determined with TANDEM, we find that response to MAPK pathway inhibitors is largely predicted by mutation data, while predicting response to DNA damaging agents requires gene expression data, in particular SLFN11 expression.
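The two-stage idea can be sketched in a few lines: fit a first model on upstream features, then fit a second model on whatever the first leaves unexplained. The sketch below is a minimal pure-Python illustration under stated assumptions, not the TANDEM R package or its API. The function names, the penalty settings (`lam`, `alpha`), the coordinate-descent solver and the toy data are all hypothetical, and the abstract does not say how the package tunes its penalties.

```python
def soft_threshold(z, g):
    """Shrink z toward zero by g (the L1 proximal step)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def fit_elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)*sum((y - b0 - x.beta)^2) + lam*(alpha*|beta|_1 + (1-alpha)/2*|beta|_2^2).
    Returns (intercept, coefficients). lam and alpha are assumed values."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    cols = [[row[j] for row in X] for j in range(p)]
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            denom = sum(c * c for c in col) / n + lam * (1 - alpha)
            if denom == 0.0:
                continue  # all-zero feature under a pure lasso penalty
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam * alpha) / denom
            step = new - beta[j]
            resid = [r - c * step for c, r in zip(col, resid)]
            beta[j] = new
        shift = sum(resid) / n  # unpenalized intercept absorbs the mean residual
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta


def predict(model, x):
    b0, beta = model
    return b0 + sum(b * v for b, v in zip(beta, x))


def fit_two_stage(X_up, X_down, y, **kw):
    """Stage 1: response ~ upstream features. Stage 2: remainder ~ downstream features."""
    stage1 = fit_elastic_net(X_up, y, **kw)
    remainder = [yi - predict(stage1, x) for x, yi in zip(X_up, y)]
    stage2 = fit_elastic_net(X_down, remainder, **kw)
    return stage1, stage2


def predict_two_stage(stages, x_up, x_down):
    return predict(stages[0], x_up) + predict(stages[1], x_down)


# Hypothetical toy data: one mutation (upstream), one expression feature that
# tracks the mutation (expr_a) and one independent expression feature (expr_b).
n = 60
mut = [float(i % 2) for i in range(n)]
expr_a = [m + 0.1 * ((i // 2) % 3 - 1) for i, m in enumerate(mut)]
expr_b = [((i // 2) % 5 - 2) / 2.0 for i in range(n)]
y = [2.0 * m + b for m, b in zip(mut, expr_b)]

X_up = [[m] for m in mut]
X_down = [[a, b] for a, b in zip(expr_a, expr_b)]
stage1, stage2 = fit_two_stage(X_up, X_down, y)

# In-sample error after stage 1 alone and after both stages.
mse_stage1 = sum((yi - predict(stage1, xu)) ** 2 for xu, yi in zip(X_up, y)) / n
mse_both = sum(
    (yi - predict_two_stage((stage1, stage2), xu, xd)) ** 2
    for xu, xd, yi in zip(X_up, X_down, y)
) / n
print("stage 1 coefficients:", stage1[1])
print("stage 2 coefficients:", stage2[1])
print("MSE after stage 1:", mse_stage1, "after both stages:", mse_both)
```

Because stage 1 gets the first chance to explain the response, signal shared between the mutation and the expression feature that tracks it is credited to the mutation; expression only receives what is left over. That ordering is what makes the contribution of each data type easier to read.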

AVAILABILITY AND IMPLEMENTATION

TANDEM is available as an R package on CRAN (for more information, see http://ccb.nki.nl/software/tandem).

CONTACT

m.michaut@nki.nl or l.wessels@nki.nl

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

A regression model for estimating DNA copy number applied to capture sequencing data.

Bioinformatics. 2012 Sep 15;28(18):2357-65. doi: 10.1093/bioinformatics/bts448. Epub 2012 Jul 13.

Smoothing waves in array CGH tumor profiles.

Bioinformatics. 2009 May 1;25(9):1099-104. doi: 10.1093/bioinformatics/btp132. Epub 2009 Mar 10.

OncoScape: Exploring the cancer aberration landscape by genomic data fusion.

Sci Rep. 2016 Jun 20;6:28103. doi: 10.1038/srep28103.

Efficient methods for identifying mutated driver pathways in cancer.

Bioinformatics. 2012 Nov 15;28(22):2940-7. doi: 10.1093/bioinformatics/bts564. Epub 2012 Sep 14.

Integrated analysis of copy number alterations and gene expression: a bivariate assessment of equally directed abnormalities.

Bioinformatics. 2009 Dec 15;25(24):3228-35. doi: 10.1093/bioinformatics/btp592. Epub 2009 Oct 14.

Cancer driver gene discovery through an integrative genomics approach in a non-parametric Bayesian framework.

Bioinformatics. 2017 Feb 15;33(4):483-490. doi: 10.1093/bioinformatics/btw662.

CNAmet: an R package for integrating copy number, methylation and expression data.

Bioinformatics. 2011 Mar 15;27(6):887-8. doi: 10.1093/bioinformatics/btr019. Epub 2011 Jan 12.

PLRS: a flexible tool for the joint analysis of DNA copy number and mRNA expression data.

Bioinformatics. 2013 Apr 15;29(8):1081-2. doi: 10.1093/bioinformatics/btt082. Epub 2013 Feb 17.

MACE: mutation-oriented profiling of chemical response and gene expression in cancers.

Bioinformatics. 2015 May 1;31(9):1508-14. doi: 10.1093/bioinformatics/btu835. Epub 2014 Dec 22.

Cited by

Knowledge-Informed Machine Learning for Cancer Diagnosis and Prognosis: A Review.

IEEE Trans Autom Sci Eng. 2025;22:10008-10028. doi: 10.1109/tase.2024.3515839. Epub 2024 Dec 18.

Integrated Workflow for Drug Repurposing in Glioblastoma: Computational Prediction and Preclinical Validation of Therapeutic Candidates.

Brain Sci. 2025 Jun 13;15(6):637. doi: 10.3390/brainsci15060637.

State-of-the-Art Liver Cancer Organoids: Modeling Cancer Stem Cell Heterogeneity for Personalized Treatment.

BioDrugs. 2025 Mar;39(2):237-260. doi: 10.1007/s40259-024-00702-0. Epub 2025 Jan 18.

Comparative evaluation of feature reduction methods for drug response prediction.

Sci Rep. 2024 Dec 28;14(1):30885. doi: 10.1038/s41598-024-81866-1.

Priority-Elastic net for binary disease outcome prediction based on multi-omics data.

BioData Min. 2024 Oct 29;17(1):45. doi: 10.1186/s13040-024-00401-0.

Machine learning-driven exploration of drug therapies for triple-negative breast cancer treatment.

Front Mol Biosci. 2023 Aug 4;10:1215204. doi: 10.3389/fmolb.2023.1215204. eCollection 2023.

Ranking Breast Cancer Drugs and Biomarkers Identification Using Machine Learning and Pharmacogenomics.

ACS Pharmacol Transl Sci. 2023 Feb 24;6(3):399-409. doi: 10.1021/acsptsci.2c00212. eCollection 2023 Mar 10.

Ten quick tips for biomarker discovery and validation analyses using machine learning.

PLoS Comput Biol. 2022 Aug 11;18(8):e1010357. doi: 10.1371/journal.pcbi.1010357. eCollection 2022 Aug.

Functional regulations between genetic alteration-driven genes and drug target genes acting as prognostic biomarkers in breast cancer.

Sci Rep. 2022 Jun 23;12(1):10641. doi: 10.1038/s41598-022-13835-5.

Deep reinforcement learning for personalized treatment recommendation.

Stat Med. 2022 Sep 10;41(20):4034-4056. doi: 10.1002/sim.9491. Epub 2022 Jun 18.
