Collaborative representation-based classification of microarray gene expression data.

Author information

Shen Lizhen, Jiang Hua, He Mingfang, Liu Guoqing

Affiliations

School of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, Nanjing, 211800, China.

School of Physical and Mathematical Sciences, Nanjing Tech University, Nanjing, 211800, China.

Publication information

PLoS One. 2017 Dec 13;12(12):e0189533. doi: 10.1371/journal.pone.0189533. eCollection 2017.

Abstract

Microarray technology makes it possible to measure the expression of many genes simultaneously over a number of time points. Multiple classifier models, such as the sparse representation (SR)-based method, have been developed to classify microarray gene expression data. These methods assign gene expression data points to different classes. In this paper, we propose a novel collaborative representation (CR)-based classification method with regularized least squares to classify gene expression data. First, the CR codes a testing sample as a sparse linear combination of all training samples; it then classifies the testing sample by evaluating which class leads to the minimum representation error. This CR-based classification approach is considerably less complex than traditional classification methods yet yields very competitive classification results. In addition, a compressive sensing approach is adopted to project the high-dimensional gene expression dataset onto a lower-dimensional space that retains nearly all of the information. This near-lossless compression reduces the computational load. Experiments on detecting disease subtypes, such as those of leukemia and autism spectrum disorders, are performed by analyzing gene expression. The results show that the proposed CR-based algorithm exhibits significantly higher stability and accuracy than traditional classifiers such as the support vector machine.
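
The pipeline described in the abstract (project to a lower dimension, code the test sample over all training samples with regularized least squares, pick the class with the smallest representation error) might be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation. The function names, the regularization weight `lam`, the Gaussian random projection standing in for the compressive sensing step, and the synthetic data are all assumptions introduced here.

```python
import numpy as np


def random_projection(n_components, n_features, rng):
    """Gaussian random matrix (an assumed stand-in for the compressive sensing step)."""
    return rng.normal(size=(n_components, n_features)) / np.sqrt(n_components)


def crc_fit(X_train, lam=0.01):
    """Precompute P = (X^T X + lam * I)^{-1} X^T; columns of X_train are samples."""
    n_samples = X_train.shape[1]
    gram = X_train.T @ X_train + lam * np.eye(n_samples)
    return np.linalg.solve(gram, X_train.T)


def crc_predict(X_train, labels, P, y):
    """Code y over all training samples, then return the class with least residual."""
    alpha = P @ y  # regularized least-squares coefficients, one per training sample
    classes = np.unique(labels)
    residuals = [
        # Reconstruct y using only the samples (and coefficients) of class c.
        np.linalg.norm(y - X_train[:, labels == c] @ alpha[labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]


# Synthetic toy data (NOT real microarray data): 2 classes, 20 samples each,
# 2000 hypothetical "genes" per sample.
rng = np.random.default_rng(0)
n_genes, per_class = 2000, 20
means = rng.normal(size=(2, n_genes))
X = np.hstack(
    [means[c][:, None] + 0.1 * rng.normal(size=(n_genes, per_class)) for c in (0, 1)]
)
labels = np.repeat([0, 1], per_class)
y_test = means[1] + 0.1 * rng.normal(size=n_genes)  # unseen sample drawn from class 1

R = random_projection(200, n_genes, rng)  # compress 2000 -> 200 dimensions
X_low, y_low = R @ X, R @ y_test
P = crc_fit(X_low, lam=0.01)  # depends only on the training set
print(crc_predict(X_low, labels, P, y_low))
```

Because `P` depends only on the training samples, it can be computed once and reused for every test sample, so coding a new sample costs a single matrix-vector product. That is one plausible reading of the abstract's claim of lower complexity.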

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8f8/5728509/60944614ef8c/pone.0189533.g001.jpg
