Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland.
Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland.
Bioinformatics. 2017 Jul 15;33(14):i359-i368. doi: 10.1093/bioinformatics/btx266.
A prime challenge in precision cancer medicine is to identify genomic and molecular features that are predictive of drug treatment responses in cancer cells. Although several computational models predict drug response accurately, they often cannot infer which feature combinations are the most predictive, particularly for high-dimensional molecular datasets. As a growing number of diverse genome-wide data sources become available, there is a need for new computational models that can effectively combine these data sources and identify maximally predictive feature combinations.
We present a novel approach that leverages systematic integration of data sources to identify response-predictive features of multiple drugs. To solve the modeling task, we implement a Bayesian linear regression method. To further improve the usefulness of the proposed model, we exploit the known human cancer kinome to identify biologically relevant feature combinations. In case studies with a synthetic dataset and two publicly available cancer cell line datasets, we demonstrate the improved accuracy of our method compared with widely used approaches in drug response analysis. As key examples, our model identifies meaningful combinations of features for the well-known EGFR, ALK, PLK and PDGFR inhibitors.
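To make the idea of Bayesian linear regression over integrated data sources concrete, the following is a minimal sketch and not the authors' model. It assumes a plain conjugate Gaussian prior with fixed precisions (`alpha`, `beta`), two hypothetical data views (`view_a`, `view_b`) that are simply concatenated, and synthetic data. The published method is more elaborate, and its actual implementation is in the repository linked in this abstract.

```python
import numpy as np

def bayes_linreg_posterior(X, y, alpha=1.0, beta=25.0):
    """Posterior over weights w for y = X w + noise.

    Assumed model (illustrative only):
      prior  w ~ N(0, alpha^-1 I)
      noise    ~ N(0, beta^-1)
    Returns the posterior mean and covariance of w.
    """
    d = X.shape[1]
    precision = alpha * np.eye(d) + beta * X.T @ X
    cov = np.linalg.inv(precision)
    mean = beta * cov @ X.T @ y
    return mean, cov

# Synthetic stand-ins for two data sources measured on the same cell lines.
rng = np.random.default_rng(0)
n = 200
view_a = rng.normal(size=(n, 5))   # hypothetical view, e.g. expression features
view_b = rng.normal(size=(n, 3))   # hypothetical view, e.g. mutation features
X = np.hstack([view_a, view_b])    # naive integration by concatenation

# Only feature 0 (view_a) and feature 6 (view_b) drive the simulated response.
true_w = np.array([2.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.5, 0.0])
y = X @ true_w + rng.normal(scale=0.2, size=n)

mean, cov = bayes_linreg_posterior(X, y)

# Rank features by absolute posterior mean weight.
ranked = np.argsort(-np.abs(mean))
print("top features:", ranked[:2])
print("posterior std:", np.sqrt(np.diag(cov)).round(3))
```

The posterior covariance gives a per-feature uncertainty alongside each weight, which is what makes a Bayesian formulation convenient for asking which features are predictive rather than only how well the model predicts.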
The source code of the method is available at https://github.com/suleimank/mvlr.
muhammad.ammad-ud-din@helsinki.fi or suleiman.khan@helsinki.fi.
Supplementary data are available at Bioinformatics online.