Auto-HMM-LMF: feature selection based method for prediction of drug response via autoencoder and hidden Markov model.

Author information

Emdadi Akram, Eslahchi Changiz

Affiliations

Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran.

School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), 193955746, Tehran, Iran.

Publication information

BMC Bioinformatics. 2021 Jan 28;22(1):33. doi: 10.1186/s12859-021-03974-3.

Abstract

BACKGROUND

Predicting the response of cancer cell lines to specific drugs is an essential problem in personalized medicine. Since drug response is closely associated with genomic information in cancer cells, large panels of several hundred human cancer cell lines have been assembled together with genomic and pharmacogenomic data. Although several methods have been developed to predict drug response, many challenges remain in achieving accurate predictions. This study proposes a novel feature selection-based method, named Auto-HMM-LMF, to predict cell line-drug associations accurately. Because the feature space for drug response prediction is very high-dimensional, Auto-HMM-LMF focuses on feature selection, that is, on identifying a subset of inputs that contribute significantly to the prediction.

RESULTS

This research introduces a novel method for selecting features from mutation data, based on signature assignments and hidden Markov models. We also use autoencoder models to select features from gene expression and copy number variation data. After feature selection, a logistic matrix factorization model is applied to predict drug response values. In a comparison with the ensemble feature selection (EFS) method, one of the most powerful feature selection methods, the predictive model built on the features selected in this paper performed much better at drug response prediction. Two datasets, the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Cell Line Encyclopedia (CCLE), are used to demonstrate the efficiency of the proposed method on unseen patient cell lines. Evaluation of the proposed model showed that Auto-HMM-LMF can improve on the accuracy of state-of-the-art algorithms and can find useful features for the logistic matrix factorization method.
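To make the final prediction step more concrete, the following is a minimal sketch of plain logistic matrix factorization on a toy binary response matrix. It is not the authors' implementation (see their repository for that): it ignores the selected genomic features and any additional terms the Auto-HMM-LMF model may include, and the function name `fit_lmf`, the hyperparameters, and the toy data are all assumptions made for illustration. The model assumes P(sensitive) = sigmoid(u_i · v_j) for cell line i and drug j and is fitted by gradient ascent on the regularized log-likelihood of the observed entries.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_lmf(Y, mask, rank=2, lr=0.1, reg=0.01, epochs=2000, seed=0):
    """Fit P(Y[i, j] = 1) = sigmoid(U[i] @ V[j]) on observed entries.

    Y    : binary matrix (cell lines x drugs), 1 = sensitive (hypothetical coding)
    mask : 1 where the response was observed, 0 where it is held out
    """
    rng = np.random.default_rng(seed)
    n_cells, n_drugs = Y.shape
    U = 0.1 * rng.standard_normal((n_cells, rank))  # cell-line latent factors
    V = 0.1 * rng.standard_normal((n_drugs, rank))  # drug latent factors
    for _ in range(epochs):
        P = sigmoid(U @ V.T)
        R = mask * (Y - P)            # residuals on observed entries only
        grad_U = R @ V - reg * U      # gradient of the L2-regularized log-likelihood
        grad_V = R.T @ U - reg * V
        U += lr * grad_U
        V += lr * grad_V
    return U, V

# Toy data (made up): two groups of cell lines with opposite drug sensitivity.
Y = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
mask = np.ones_like(Y)
mask[0, 1] = 0.0                      # hide one cell line-drug pair

U, V = fit_lmf(Y, mask)
P = sigmoid(U @ V.T)                  # predicted sensitivity probabilities
print(np.round(P, 2))
```

Here `P[0, 1]` is the predicted probability for the hidden pair, obtained only from the latent factors learned on the remaining entries. In a feature-based pipeline such as the one described above, the selected features would additionally inform the model; this sketch leaves that part out.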

CONCLUSIONS

We presented an application of Auto-HMM-LMF to exploring new candidate drugs for head and neck cancer, which showed that the proposed method is useful in drug repositioning and personalized medicine. The source code of the Auto-HMM-LMF method is available at https://github.com/emdadi/Auto-HMM-LMF .
