Neuro-Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
Bioinformatics. 2011 Jan 15;27(2):220-4. doi: 10.1093/bioinformatics/btq628. Epub 2010 Dec 5.
Panels of cell lines such as the NCI-60 have long been used to test drug candidates for their ability to inhibit proliferation. Predictive models of in vitro drug sensitivity have previously been constructed using gene expression signatures derived from microarray data. These statistical models allow drug response to be predicted for cell lines outside the original NCI-60 panel. We improve on existing techniques by developing a novel multistep algorithm that builds regression models of drug response using Random Forest, an ensemble approach based on classification and regression trees (CART).
This method proved successful in predicting drug response for panels of 19 breast cancer and 7 glioma cell lines, outperformed other methods based on differential gene expression, and has general utility for any application that seeks to relate gene expression data to a continuous output variable.
Software was written in the R language and will be available together with associated gene expression and drug response data as the package ivDrug at http://r-forge.r-project.org.
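To make the core idea concrete, the following is a minimal sketch of Random Forest regression: bootstrap-resampled CART regression trees, each split drawn from a random subset of features, with predictions averaged across trees. It is written in standard-library Python rather than R, and it is not the ivDrug package or the authors' multistep algorithm, whose individual steps are not given in this abstract. All function names, parameters, and the synthetic "expression" and "response" values are assumptions made purely for illustration.

```python
import random
from statistics import mean


def best_split(rows, targets, feature_ids):
    """Return the (feature, threshold) pair with the lowest squared error, or None."""
    best, best_sse = None, float("inf")
    for f in feature_ids:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for r, t in zip(rows, targets) if r[f] <= thr]
            right = [t for r, t in zip(rows, targets) if r[f] > thr]
            m_left, m_right = mean(left), mean(right)
            sse = (sum((t - m_left) ** 2 for t in left)
                   + sum((t - m_right) ** 2 for t in right))
            if sse < best_sse:
                best, best_sse = (f, thr), sse
    return best


def grow_tree(rows, targets, rng, mtry, min_split=4):
    """Grow one regression tree; a leaf is a float, a node is a 4-tuple."""
    if len(rows) < min_split or len(set(targets)) == 1:
        return mean(targets)
    # Only a random subset of features is considered at each split.
    feature_ids = rng.sample(range(len(rows[0])), mtry)
    split = best_split(rows, targets, feature_ids)
    if split is None:  # the sampled features are constant in this node
        return mean(targets)
    f, thr = split
    left = [i for i, r in enumerate(rows) if r[f] <= thr]
    right = [i for i, r in enumerate(rows) if r[f] > thr]
    return (f, thr,
            grow_tree([rows[i] for i in left], [targets[i] for i in left],
                      rng, mtry, min_split),
            grow_tree([rows[i] for i in right], [targets[i] for i in right],
                      rng, mtry, min_split))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node


def fit_forest(rows, targets, n_trees=50, mtry=None, seed=0):
    """Fit n_trees trees, each on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    mtry = mtry or max(1, p // 3)  # p/3 is a common default for regression
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [targets[i] for i in idx], rng, mtry))
    return forest


def predict_forest(forest, row):
    """Average the per-tree predictions."""
    return mean([predict_tree(tree, row) for tree in forest])


# Synthetic stand-in data (assumption): 40 "cell lines" x 6 "genes", with the
# response driven by gene 0 plus a small amount of Gaussian noise.
data_rng = random.Random(1)
train_expr = [[data_rng.random() for _ in range(6)] for _ in range(40)]
train_resp = [5.0 + 3.0 * row[0] + 0.1 * data_rng.gauss(0, 1)
              for row in train_expr]

forest = fit_forest(train_expr, train_resp, n_trees=50, seed=0)

# Predict the response of two unseen profiles that differ only in gene 0.
high = predict_forest(forest, [0.9, 0.5, 0.5, 0.5, 0.5, 0.5])
low = predict_forest(forest, [0.1, 0.5, 0.5, 0.5, 0.5, 0.5])
print(f"predicted response: high gene 0 = {high:.2f}, low gene 0 = {low:.2f}")
```

In practice one would more likely rely on an established implementation than on a hand-written forest. The sketch is only meant to show how an expression profile from a cell line outside the training panel is mapped to a continuous response estimate.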