Comprehensive evaluation of matrix factorization methods for the analysis of DNA microarray gene expression data.

Affiliation

Seoul National University Biomedical Informatics, Systems Biomedical Informatics Research Center, and Interdisciplinary Program of Medical Informatics Div. of Biomedical Informatics, Seoul National University College of Medicine, Seoul 110799, Korea.

Publication information

BMC Bioinformatics. 2011;12 Suppl 13(Suppl 13):S8. doi: 10.1186/1471-2105-12-S13-S8. Epub 2011 Nov 30.

Abstract

BACKGROUND

Clustering-based methods for gene expression analysis have been shown to be useful in biomedical applications such as cancer subtype discovery. Among them, matrix factorization (MF) is advantageous for clustering gene expression patterns from DNA microarray experiments, as it efficiently reduces the dimensionality of gene expression data. Although several MF methods have been proposed for clustering gene expression patterns, a systematic evaluation has not yet been reported.

RESULTS

We evaluated the clustering performance of orthogonal and non-orthogonal MFs using nine performance measures on four gene expression datasets and one well-known clustering benchmark dataset. Specifically, we employed a non-orthogonal MF algorithm, BSNMF (Bi-directional Sparse Non-negative Matrix Factorization), which superimposes bi-directional sparseness constraints on non-negativity constraints, yielding factors that comprise a few dominantly co-expressed genes and samples together. Non-orthogonal MFs tended to show better clustering-quality and prediction-accuracy indices than both orthogonal MFs and a traditional method, K-means. Moreover, BSNMF showed improved performance on these measures. Non-orthogonal MFs, including BSNMF, also performed well in functional enrichment tests using Gene Ontology terms and biological pathways.
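
To illustrate how a non-orthogonal MF can cluster samples, the following is a minimal sketch of plain non-negative matrix factorization with the standard multiplicative updates. It is not BSNMF: no sparseness constraints are applied, and it is not the evaluation pipeline used in the study. The toy matrix `V`, the function names `nmf` and `cluster_samples`, and all parameter values are assumptions made for illustration only.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative genes x samples matrix V into W (genes x k)
    and H (k x samples) using multiplicative updates for squared error."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is locked at zero.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    # Rescale so each column of W sums to 1; this makes the rows of H
    # comparable before samples are assigned to factors.
    for j in range(k):
        s = sum(W[i][j] for i in range(n)) + eps
        for i in range(n):
            W[i][j] /= s
        H[j] = [h * s for h in H[j]]
    return W, H

def cluster_samples(H):
    """Assign each sample (column of H) to the factor with the largest weight."""
    return [max(range(len(H)), key=lambda c: H[c][j]) for j in range(len(H[0]))]

# Hypothetical toy data: genes 0-2 are high in samples 0-2,
# genes 3-5 are high in samples 3-5.
V = [
    [5.0, 4.0, 6.0, 0.1, 0.2, 0.1],
    [4.0, 5.0, 5.0, 0.2, 0.1, 0.1],
    [6.0, 5.0, 4.0, 0.1, 0.1, 0.2],
    [0.1, 0.2, 0.1, 5.0, 6.0, 4.0],
    [0.2, 0.1, 0.1, 4.0, 5.0, 5.0],
    [0.1, 0.1, 0.2, 6.0, 4.0, 5.0],
]

W, H = nmf(V, k=2)
labels = cluster_samples(H)
print(labels)
```

Because the factor order depends on the random start, the cluster labels are arbitrary; only the grouping of samples is meaningful. An orthogonal MF such as SVD would instead produce factors with mixed signs, which is one reason the two families can be expected to cluster differently.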

CONCLUSIONS

In conclusion, the clustering performance of orthogonal and non-orthogonal MFs on microarray data was evaluated using a comprehensive set of measures. The study showed that non-orthogonal MFs perform better than orthogonal MFs and K-means for clustering microarray data.
