Suppr超能文献

为聚类和网络分析筛选基因。

Filtering genes for cluster and network analysis.

作者信息

Tritchler David, Parkhomenko Elena, Beyene Joseph

机构信息

Department of Biostatistics, University of Toronto, Toronto, Ontario, Canada.

出版信息

BMC Bioinformatics. 2009 Jun 23;10:193. doi: 10.1186/1471-2105-10-193.

Abstract

BACKGROUND

Prior to cluster analysis or genetic network analysis it is customary to filter, or remove genes considered to be irrelevant from the set of genes to be analyzed. Often genes whose variation across samples is less than an arbitrary threshold value are deleted. This can improve interpretability and reduce bias.

RESULTS

This paper introduces modular models for representing network structure in order to study the relative effects of different filtering methods. We show that cluster analysis and principal components are strongly affected by filtering. Filtering methods intended specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. To study more realistic situations, we analyze simulated "real" data based on well-characterized E. coli and S. cerevisiae regulatory networks.

CONCLUSION

The methods introduced apply very generally, to any similarity matrix describing gene expression. One of the proposed methods, SUMCOV, performed well for all models simulated.

摘要

背景

在进行聚类分析或基因网络分析之前,通常会从待分析的基因集中筛选或去除被认为无关的基因。通常会删除那些在样本间变异小于任意阈值的基因。这可以提高可解释性并减少偏差。

结果

本文引入了用于表示网络结构的模块化模型,以研究不同筛选方法的相对影响。我们表明聚类分析和主成分分析受筛选的影响很大。通过模拟具有已知统计特性的模块化网络,引入并比较了专门用于聚类和网络分析的筛选方法。为了研究更现实的情况,我们基于特征明确的大肠杆菌和酿酒酵母调控网络分析了模拟的“真实”数据。

结论

所介绍的方法非常通用,适用于描述基因表达的任何相似性矩阵。所提出的方法之一SUMCOV在所有模拟模型中表现良好。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f484/2708160/a02679183ee1/1471-2105-10-193-1.jpg

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验