A novel biclustering approach with iterative optimization to analyze gene expression data.

Author information

Sutheeworapong Sawannee, Ota Motonori, Ohta Hiroyuki, Kinoshita Kengo

Affiliations

Department of Biological Sciences, Graduate School of Biosciences and Biotechnology, Tokyo Institute of Technology, Tokyo, Japan; Graduate School of Information Sciences, Tohoku University, Miyagi, Japan.

Publication information

Adv Appl Bioinform Chem. 2012;5:23-59. doi: 10.2147/AABC.S32622. Epub 2012 Sep 7.

Abstract

OBJECTIVE

With the dramatic increase in microarray data, biclustering has become a promising tool for gene expression analysis. Biclustering has been shown to be superior to conventional clustering at identifying multifunctional genes and at finding genes that are co-expressed under only a few specific conditions, that is, under a subset of all conditions. Biclustering based on a genetic algorithm (GA) has performed better than greedy algorithms, but the overlap between biclusters needs to be treated more systematically.

RESULTS

We developed a new biclustering algorithm, the binary-iterative genetic algorithm (BIGA), which is based on an iterative GA and introduces a novel ternary-digit chromosome encoding function. BIGA searches for a set of biclusters by iterative binary divisions, which allows the overlap state to be considered explicitly. In addition, the average Pearson's correlation coefficient was used to measure the relationship among genes within a bicluster, instead of the mean square residual, the popular classical index. Compared with six existing algorithms, BIGA found highly correlated biclusters with large gene coverage and reasonable gene overlap. Gene ontology (GO) enrichment analysis showed that most of the biclusters were significant, with at least one GO term overrepresented.
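To make the contrast between the two quality indices concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the average correlation is the plain mean of Pearson's r over all gene pairs in the bicluster (the abstract does not state the exact formula) and that the mean square residual is the classical residue of Cheng and Church. The function names and the toy expression matrix are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's r between two equal-length profiles (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def submatrix(data, genes, conditions):
    """Extract the bicluster defined by row (gene) and column (condition) indices."""
    return [[data[g][c] for c in conditions] for g in genes]


def average_correlation(sub):
    """Mean of Pearson's r over all gene pairs (assumed form of the index)."""
    pairs = list(combinations(sub, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


def mean_squared_residue(sub):
    """Classical mean squared residue: mean of (a_ij - a_iJ - a_Ij + a_IJ)^2."""
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(row[j] for row in sub) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)


# Hypothetical toy data: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0, 4.0],   # gene 0
    [2.0, 4.0, 6.0, 8.0],   # gene 1: gene 0 scaled by 2
    [4.0, 3.0, 2.0, 1.0],   # gene 2: opposite trend, left out of the bicluster
]

bicluster = submatrix(expression, genes=[0, 1], conditions=[0, 1, 2, 3])
avg_r = average_correlation(bicluster)
msr = mean_squared_residue(bicluster)
print(f"average Pearson r = {avg_r:.4f}, mean squared residue = {msr:.4f}")
```

In this toy case the two genes follow the same pattern up to a scale factor, so their correlation is essentially 1, while the mean squared residue is non-zero because that index rewards additive (shift) patterns rather than scaling ones. This is one plausible motivation for a correlation-based index, stated here as an illustration rather than as the authors' argument.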

CONCLUSION

BIGA is a powerful tool for analyzing large amounts of gene expression data, and it will facilitate the elucidation of the underlying functional mechanisms in living organisms.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04e8/3459542/f6ee836949b3/aabc-5-023f1.jpg
