A classification framework applied to cancer gene expression profiles.

Affiliation

Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA.

Publication information

J Healthc Eng. 2013;4(2):255-83. doi: 10.1260/2040-2295.4.2.255.

Abstract

Classification of cancer based on gene expression has provided insight into possible treatment strategies. Developing machine learning methods that can reliably distinguish among cancer subtypes, or between normal and cancer samples, is therefore important. This work discusses supervised learning techniques that have been employed to classify cancers. In addition, a two-step feature selection method, based on an attribute estimation method (e.g., ReliefF) followed by a genetic algorithm, was employed to find a set of genes that best differentiates between cancer subtypes or between normal and cancer samples. Applying different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) to 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally achieve better classification performance than the other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data together with gene expression) increases the prediction accuracy compared with using gene expression alone.
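The two-step selection described in the abstract (a filter that scores genes, then a genetic algorithm that searches among the top-scoring ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so everything here is an assumption. The filter is the original two-class Relief rather than ReliefF, the genetic algorithm's fitness is leave-one-out 1-nearest-neighbor accuracy with a small penalty per gene, the data are synthetic, and all function names (`relief_scores`, `ga_select`, `make_toy`, `loo_knn_accuracy`) are hypothetical.

```python
import random

def relief_scores(X, y, n_iter=50, seed=0):
    """Simplified two-class Relief (not ReliefF): a feature gains weight when it
    differs on the nearest miss and loses weight when it differs on the nearest hit."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Per-feature range, used to scale differences into [0, 1].
    span = [max(r[j] for r in X) - min(r[j] for r in X) or 1.0 for j in range(d)]
    def dist(a, b):
        return sum(abs(a[j] - b[j]) / span[j] for j in range(d))
    w = [0.0] * d
    for _ in range(n_iter):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        h = min(hits, key=lambda k: dist(X[i], X[k]))
        m = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            w[j] += (abs(X[i][j] - X[m][j]) - abs(X[i][j] - X[h][j])) / (span[j] * n_iter)
    return w

def loo_knn_accuracy(X, y, feats):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier restricted to feats."""
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in feats))
        correct += y[nearest] == y[i]
    return correct / len(X)

def ga_select(X, y, candidates, pop_size=12, generations=15, seed=0):
    """Toy genetic algorithm over bit masks of the candidate features."""
    rng = random.Random(seed)
    def fitness(mask):
        feats = [f for f, bit in zip(candidates, mask) if bit]
        # Assumed penalty of 0.01 per gene to favor compact subsets.
        return loo_knn_accuracy(X, y, feats) - 0.01 * len(feats)
    pop = [[rng.randint(0, 1) for _ in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))
            child = a[:cut] + b[cut:]           # one-point crossover
            if rng.random() < 0.2:              # occasional single-bit mutation
                child[rng.randrange(len(candidates))] ^= 1
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [f for f, bit in zip(candidates, best) if bit]

def make_toy(n=40, d=10, seed=1):
    """Synthetic 'expression' matrix: only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(d)]
        row[0] += 4 * label
        row[1] -= 4 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy()
scores = relief_scores(X, y)
# Step 1: keep the 5 best-scoring features; step 2: let the GA pick a subset of them.
top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:5]
selected = ga_select(X, y, top)
print("filter candidates:", top)
print("GA-selected subset:", selected)
```

The filter step is cheap and shrinks the search space, which matters because the wrapper step evaluates a classifier for every candidate subset. On real expression data, the fitness should be computed with cross-validation that keeps test samples out of the selection loop; otherwise the reported accuracy will be optimistic.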
