Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou Str., Athens GR-11527, Greece; Molecular Carcinogenesis Group, Department of Histology and Embryology, School of Medicine, National and Kapodistrian University of Athens, 75 Mikras Asias Str, Athens GR-11527, Greece.
Department of Pathology, NYU School of Medicine, New York, NY 10016, USA; Laura and Isaac Perlmutter Cancer Center, NYU School of Medicine, New York, NY 10016, USA.
Pharmacol Ther. 2019 Nov;203:107395. doi: 10.1016/j.pharmthera.2019.107395. Epub 2019 Jul 30.
A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big "omic" data into effective drug-response models. Machine learning is an expanding and evolving computational field that holds promise for meeting these needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) the data resources that are fed into these frameworks and 4) the pitfalls and challenges in maximizing model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response, and compare it with relevant data mining frameworks, for which we generated a web application freely available at https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency, detects low-frequency events, and evaluates statistical significance even in multidimensional space, presenting the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine-learning-based drug response prediction in precision medicine.
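To illustrate the kind of rule that Association Rule Mining produces, the following is a minimal sketch and not the authors' pipeline. It assumes each cell line is represented as a set of items (hypothetical gene-alteration labels plus a drug-response label) and scores single-gene rules of the form "alteration → response" with the standard support, confidence and lift measures. The toy data, the item names (`GENE_A_mut`, `GENE_B_mut`), the function names and the thresholds are all invented for illustration; the statistical significance testing mentioned in the abstract is not covered here.

```python
# Hypothetical toy data: one set of items per cell line.
# Item names and response labels are illustrative assumptions only.
samples = [
    {"GENE_A_mut", "sensitive"},
    {"GENE_A_mut", "sensitive"},
    {"GENE_A_mut", "GENE_B_mut", "sensitive"},
    {"GENE_B_mut", "resistant"},
    {"GENE_B_mut", "resistant"},
    {"resistant"},
    {"GENE_A_mut", "resistant"},
    {"sensitive"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(lhs, rhs, transactions):
    """Support, confidence and lift of the rule lhs -> rhs."""
    s_lhs = support(lhs, transactions)
    s_rhs = support(rhs, transactions)
    s_both = support(lhs | rhs, transactions)
    confidence = s_both / s_lhs if s_lhs else 0.0
    lift = confidence / s_rhs if s_rhs else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


def mine_rules(transactions, outcomes, min_support=0.1, min_confidence=0.7):
    """Keep single-feature rules 'feature -> outcome' that pass both thresholds."""
    features = sorted(set().union(*transactions) - outcomes)
    rules = []
    for feature in features:
        for outcome in sorted(outcomes):
            m = rule_metrics({feature}, {outcome}, transactions)
            if m["support"] >= min_support and m["confidence"] >= min_confidence:
                rules.append((feature, outcome, m))
    return rules


rules = mine_rules(samples, outcomes={"sensitive", "resistant"})
for feature, outcome, m in rules:
    print(f"{feature} -> {outcome}: {m}")
```

A lift above 1 indicates that the alteration and the response co-occur more often than expected if they were independent, which is what makes such a rule a candidate worth testing further. Lowering `min_support` is the knob that lets low-frequency events through, at the cost of more rules to evaluate.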