A new measure for gene expression biclustering based on non-parametric correlation.

Author information

Intelligent Systems Group, Department of Computer Sciences and Artificial Intelligence, University of the Basque Country, P.O. Box 649, 20080 Donostia - San Sebastian, Spain.

Publication information

Comput Methods Programs Biomed. 2013 Dec;112(3):367-97. doi: 10.1016/j.cmpb.2013.07.025. Epub 2013 Aug 19.

Abstract

BACKGROUND

Biclustering, one of the emerging techniques for analyzing DNA microarray data, is the search for subsets of genes and conditions that are coherently expressed. These subgroups provide clues about the main biological processes. Several approaches to this problem have been proposed so far. Most of them use the mean squared residue as the quality measure, but this measure cannot detect relevant and interesting patterns such as shifting or scaling patterns. Furthermore, recent papers show that new coherence patterns, such as inverse relationships between genes, are involved in different kinds of cancer and tumors, and these cannot be captured either.
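To make the limitation concrete, the following is a minimal sketch of the mean squared residue, assuming the usual definition (the mean of the squared residues a_ij - rowmean_i - colmean_j + overallmean over the bicluster). The function name and the toy matrices are hypothetical and are not taken from the paper.

```python
def mean_squared_residue(matrix):
    """Mean squared residue of a bicluster given as a list of equal-length rows."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = matrix[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: one expression profile over four conditions.
base = [1.0, 2.0, 4.0, 3.0]

# Three genes with an identical profile.
identical = [base[:] for _ in range(3)]

# Scaling pattern: each gene is the base profile times a different factor.
scaled = [[factor * value for value in base] for factor in (1.0, 2.0, 3.0)]

# Inverse relationship: the second gene mirrors the first.
inverted = [base[:], [-value for value in base]]

msr_identical = mean_squared_residue(identical)
msr_scaled = mean_squared_residue(scaled)
msr_inverted = mean_squared_residue(inverted)
```

In this toy setting the identical profiles give a residue of zero, while the scaled and inverted biclusters give a strictly positive residue even though their genes are perfectly related, which is the kind of pattern a residue-based search would tend to penalize.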

RESULTS

The proposed measure, called Spearman's biclustering measure (SBM), estimates the quality of a bicluster based on the non-linear correlation among genes and conditions simultaneously. The search for biclusters is performed with an evolutionary technique, estimation of distribution algorithms, which uses SBM as the fitness function. The approach has been examined from different points of view using artificial and real microarrays. The assessment involved quality indexes, a reference set of bicluster patterns that includes new patterns, and a set of statistical tests. Performance on real microarrays was also examined and compared with different algorithmic approaches such as Bimax, CC, OPSM, Plaid and xMotifs.
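The abstract does not give the SBM formula, so the sketch below is not the authors' measure. It only illustrates the general idea of a rank-correlation score computed over genes and conditions at once, under two assumptions made here: the score is the mean absolute pairwise Spearman correlation, and the row and column scores are simply averaged. All function names and data are hypothetical.

```python
import math
from itertools import combinations


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_abs_spearman(vectors):
    """Mean absolute Spearman correlation over all pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(abs(spearman(a, b)) for a, b in pairs) / len(pairs)


def coherence_score(matrix):
    """Illustrative score (an assumption, not SBM): average of the gene-wise
    and condition-wise mean absolute Spearman correlations."""
    rows = [list(row) for row in matrix]
    cols = [list(col) for col in zip(*matrix)]
    return (mean_abs_spearman(rows) + mean_abs_spearman(cols)) / 2


# Hypothetical bicluster: a base profile, a shifted and scaled copy, an inverted copy.
pattern = [
    [1.0, 2.0, 4.0, 3.0],
    [3.0, 5.0, 9.0, 7.0],      # 2 * base + 1
    [-1.0, -2.0, -4.0, -3.0],  # -1 * base
]

# Hypothetical bicluster with no consistent ordering between genes.
unrelated = [
    [1.0, 2.0, 4.0, 3.0],
    [3.0, 1.0, 2.0, 4.0],
    [2.0, 4.0, 1.0, 3.0],
]

pattern_score = coherence_score(pattern)
unrelated_score = coherence_score(unrelated)
```

Because the score uses ranks and absolute values, shifted, scaled and inverted profiles all count as coherent in this sketch. A function of this shape could in principle play the role of a fitness function in an evolutionary search, although how SBM itself is defined and used is described only in the full paper.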

CONCLUSIONS

SBM shows several advantages, including the ability to recognize more complex coherence patterns such as shifting, scaling and inversion, and the capability to selectively marginalize genes and conditions depending on their statistical significance.
