Poisson-based self-organizing feature maps and hierarchical clustering for serial analysis of gene expression data.

Authors

Wang Haiying, Zheng Huiru, Azuaje Francisco

Affiliation

School of Computing and Mathematics, University of Ulster, Jordanstown, Northern Ireland, UK.

Publication

IEEE/ACM Trans Comput Biol Bioinform. 2007 Apr-Jun;4(2):163-75. doi: 10.1109/TCBB.2007.070204.

Abstract

Serial analysis of gene expression (SAGE) is a powerful technique for global gene expression profiling, allowing simultaneous analysis of thousands of transcripts without prior structural and functional knowledge. Pattern discovery and visualization have become fundamental approaches to analyzing such large-scale gene expression data. From the pattern discovery perspective, clustering techniques have received great attention. However, due to the statistical nature of SAGE data (i.e., underlying distribution), traditional clustering techniques may not be suitable for SAGE data analysis. Based on the adaptation and improvement of Self-Organizing Maps and hierarchical clustering techniques, this paper presents two new clustering algorithms, namely, PoissonS and PoissonHC, for SAGE data analysis. Tested on synthetic and experimental SAGE data, these algorithms demonstrate several advantages over traditional pattern discovery techniques. The results indicate that, by incorporating statistical properties of SAGE data, PoissonS and PoissonHC, as well as a hybrid approach (neuro-hierarchical approach) based on the combination of PoissonS and PoissonHC, offer significant improvements in pattern discovery and visualization for SAGE data. Moreover, a user-friendly platform, which may improve and accelerate SAGE data mining, was implemented. The system is freely available on request from the authors for nonprofit use.
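To make the abstract's point about the statistical nature of SAGE data concrete, the following is a minimal Python sketch of a Poisson likelihood-based dissimilarity between one tag's count profile and a cluster profile. It is an illustration under stated assumptions, not the PoissonS or PoissonHC algorithm itself, whose details the abstract does not give. The function names `cluster_profile` and `poisson_dissimilarity`, and the model in which a tag's expected count in each library equals its total count times the cluster's per-library share, are assumptions introduced here.

```python
import math


def cluster_profile(members):
    """Pool counts over member tags and normalize to per-library shares.

    Assumes `members` is a non-empty list of equal-length count lists
    with a nonzero grand total (hypothetical helper for illustration).
    """
    pooled = [sum(column) for column in zip(*members)]
    grand_total = sum(pooled)
    return [p / grand_total for p in pooled]


def poisson_dissimilarity(counts, profile):
    """Negative Poisson log-likelihood of `counts` given a cluster profile.

    Assumed model: count in library j ~ Poisson(total * profile[j]),
    where total is the tag's summed count across libraries.
    """
    total = sum(counts)
    nll = 0.0
    for y, share in zip(counts, profile):
        mu = total * share  # expected count under the cluster profile
        if mu == 0:
            if y > 0:
                return math.inf  # observed counts where none are expected
            continue  # zero expected, zero observed: contributes nothing
        nll -= y * math.log(mu) - mu - math.lgamma(y + 1)
    return nll


# Two tags with the same shape but very different abundance share a profile.
profile = cluster_profile([[10, 20, 10], [100, 200, 100]])
matching = poisson_dissimilarity([1, 2, 1], profile)
mismatched = poisson_dissimilarity([1, 2, 1], [0.5, 0.25, 0.25])
print(matching < mismatched)
```

In this sketch a low-abundance tag such as `[1, 2, 1]` scores as close to a cluster built from high-abundance tags with the same relative shape, whereas a Euclidean distance on raw counts would separate them by abundance alone.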
