Visualization-based cancer microarray data classification analysis.

Authors

Mramor Minca, Leban Gregor, Demsar Janez, Zupan Blaz

Affiliation

Faculty of Computer and Information Science, University of Ljubljana, Trzaska 25, 1000 Ljubljana, Slovenia.

Publication information

Bioinformatics. 2007 Aug 15;23(16):2147-54. doi: 10.1093/bioinformatics/btm312. Epub 2007 Jun 22.

Abstract

MOTIVATION

Methods for analyzing cancer microarray data often face two distinct challenges: the models they infer need to perform well when classifying new tissue samples, while at the same time providing insight into the patterns and gene interactions hidden in the data. State-of-the-art supervised data mining methods often cover only one of these aspects well, motivating the development of methods whose predictive models combine solid classification performance with being easy to communicate to the domain expert.

RESULTS

Data visualization can be an excellent approach to knowledge discovery and analysis of class-labeled data. We have previously developed an approach called VizRank that can score and rank point-based visualizations according to the degree of separation of data instances of different classes. We here extend VizRank with techniques to uncover outliers, score features (genes) and perform classification, and demonstrate that the proposed approach is well suited for cancer microarray analysis. Using VizRank and radviz visualization on a set of previously published cancer microarray data sets, we were able to find simple, interpretable data projections that include only a small subset of genes yet clearly differentiate among different cancer types. We also report that our approach to classification through visualization achieves performance that is comparable to state-of-the-art supervised data mining techniques.
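
The idea of ranking projections by class separation can be illustrated with a minimal sketch. It is not the Orange implementation: the function names (`radviz_project`, `knn_separation_score`, `rank_projections`) are hypothetical, and using leave-one-out k-nearest-neighbor accuracy as the separation score is an assumption, since the abstract does not say how separation is measured. The sketch also keeps the feature anchors in one fixed order, although a radviz layout depends on anchor order.

```python
import itertools

import numpy as np


def radviz_project(X):
    """Map rows of X (n_samples x n_features) to 2-D radviz coordinates."""
    # Scale every feature to [0, 1]; constant features get span 1 to avoid 0/0.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    Xn = (X - lo) / span
    # Place one anchor per feature, evenly spaced on the unit circle.
    m = X.shape[1]
    angles = 2 * np.pi * np.arange(m) / m
    anchors = np.column_stack([np.cos(angles), np.sin(angles)])
    # Each sample is the weighted mean of the anchors (weights = scaled values).
    weights = Xn.sum(axis=1, keepdims=True)
    weights[weights == 0] = 1.0  # all-zero rows land at the origin
    return Xn @ anchors / weights


def knn_separation_score(points, y, k=3):
    """Leave-one-out k-NN accuracy of the projected points (assumed score)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point may not vote for itself
    nearest = np.argsort(d, axis=1)[:, :k]
    correct = 0
    for i, idx in enumerate(nearest):
        labels, counts = np.unique(y[idx], return_counts=True)
        correct += int(labels[np.argmax(counts)] == y[i])
    return correct / len(y)


def rank_projections(X, y, n_features=3, k=3):
    """Score every feature subset of the given size, best projection first."""
    scored = []
    for subset in itertools.combinations(range(X.shape[1]), n_features):
        pts = radviz_project(X[:, list(subset)])
        scored.append((knn_separation_score(pts, y, k), subset))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored
```

Exhaustive enumeration as above is feasible only for a handful of features. With thousands of genes, the candidate subsets would have to be chosen by a search heuristic, which this sketch does not attempt.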

AVAILABILITY

VizRank and radviz are implemented as part of the Orange data mining suite (http://www.ailab.si/orange).

SUPPLEMENTARY INFORMATION

Supplementary data are available from http://www.ailab.si/supp/bi-cancer.
