Improving performances of suboptimal greedy iterative biclustering heuristics via localization.

Affiliation

Department of Computer Engineering, Kadir Has University, Cibali, Istanbul 34083, Turkey.

Publication information

Bioinformatics. 2010 Oct 15;26(20):2594-600. doi: 10.1093/bioinformatics/btq473. Epub 2010 Aug 23.

DOI:10.1093/bioinformatics/btq473
PMID:20733064
Abstract

MOTIVATION

Biclustering gene expression data is the problem of extracting submatrices of genes and conditions exhibiting significant correlation across both the rows and the columns of a data matrix of expression values. Even the simplest versions of the problem are computationally hard. Most of the proposed solutions therefore employ greedy iterative heuristics that locally optimize a suitably assigned scoring function.

METHODS

We provide a fast and simple pre-processing algorithm called localization that reorders the rows and columns of the input data matrix in such a way as to group correlated entries in small local neighborhoods within the matrix. The proposed localization algorithm takes its roots from effective use of graph-theoretical methods applied to problems exhibiting a similar structure to that of biclustering. In order to evaluate the effectiveness of the localization pre-processing algorithm, we focus on three representative greedy iterative heuristic methods. We show how the localization pre-processing can be incorporated into each representative algorithm to improve biclustering performance. Furthermore, we propose a simple biclustering algorithm, Random Extraction After Localization (REAL), that randomly extracts submatrices from the localization pre-processed data matrix, eliminates those with low similarity scores, and provides the rest as correlated structures representing biclusters.
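The two steps described above (reorder the matrix, then sample and filter submatrices) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the graph-theoretical reordering or the similarity score, so the barycenter-style ordering in `localize`, the mean squared residue used as the score, and all parameter names (`sweeps`, `n_samples`, `height`, `width`, `max_score`) are assumptions chosen for illustration only.

```python
import numpy as np


def localize(data, sweeps=10):
    """Reorder rows and columns so that large entries tend to sit near each other.

    ASSUMPTION: a barycenter-style heuristic stands in for the paper's
    localization step, whose exact form the abstract does not give.
    Returns index arrays (rows, cols) describing the new order.
    """
    data = np.asarray(data, dtype=float)
    weights = np.abs(data) + 1e-12  # nonnegative weights, never all zero
    rows = np.arange(data.shape[0])
    cols = np.arange(data.shape[1])
    for _ in range(sweeps):
        w = weights[np.ix_(rows, cols)]
        # Move each row to the weighted mean position of its columns.
        row_centers = (w @ np.arange(w.shape[1])) / w.sum(axis=1)
        rows = rows[np.argsort(row_centers, kind="stable")]
        w = weights[np.ix_(rows, cols)]
        # Move each column to the weighted mean position of its rows.
        col_centers = (np.arange(w.shape[0]) @ w) / w.sum(axis=0)
        cols = cols[np.argsort(col_centers, kind="stable")]
    return rows, cols


def mean_squared_residue(block):
    """ASSUMED similarity score: lower residue means a more coherent block."""
    row_means = block.mean(axis=1, keepdims=True)
    col_means = block.mean(axis=0, keepdims=True)
    residue = block - row_means - col_means + block.mean()
    return float((residue ** 2).mean())


def real_sketch(data, n_samples=200, height=4, width=4, max_score=0.1, seed=0):
    """Sample contiguous windows of the reordered matrix and keep coherent ones."""
    data = np.asarray(data, dtype=float)
    rows, cols = localize(data)
    ordered = data[np.ix_(rows, cols)]
    rng = np.random.default_rng(seed)
    kept = []
    for _ in range(n_samples):
        r0 = rng.integers(0, ordered.shape[0] - height + 1)
        c0 = rng.integers(0, ordered.shape[1] - width + 1)
        block = ordered[r0:r0 + height, c0:c0 + width]
        score = mean_squared_residue(block)
        # Discard windows with low similarity, i.e. high residue.
        if score <= max_score:
            kept.append((rows[r0:r0 + height], cols[c0:c0 + width], score))
    return kept
```

Each kept entry holds the original row indices, the original column indices, and the score of one candidate bicluster. Sampling contiguous windows only makes sense because the reordering is expected to have placed correlated entries next to each other; on an unordered matrix the same sampler would mostly return incoherent blocks.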

RESULTS

We compare the proposed localization pre-processing with another pre-processing alternative, non-negative matrix factorization. We show that our fast and simple localization procedure provides similar or even better results than the computationally heavy matrix factorization pre-processing with regard to H-value tests. We next demonstrate that the performances of the three representative greedy iterative heuristic methods improve with localization pre-processing when biological correlations in the form of functional enrichment and PPI verification constitute the main performance criteria. The fact that REAL, the random extraction method based on localization, performs better than the representative greedy heuristic methods under the same criteria also confirms the effectiveness of the suggested pre-processing method.
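The abstract does not define the H-value. Assuming it refers to the mean squared residue score commonly used in the biclustering literature (an assumption, not something stated here), the score of a submatrix with row set I and column set J of a data matrix with entries a_ij would be:

```latex
H(I, J) = \frac{1}{|I|\,|J|} \sum_{i \in I} \sum_{j \in J}
          \left( a_{ij} - a_{iJ} - a_{Ij} + a_{IJ} \right)^{2},
\qquad
a_{iJ} = \frac{1}{|J|} \sum_{j \in J} a_{ij}, \quad
a_{Ij} = \frac{1}{|I|} \sum_{i \in I} a_{ij}, \quad
a_{IJ} = \frac{1}{|I|\,|J|} \sum_{i \in I} \sum_{j \in J} a_{ij}
```

Under this reading, a lower H-value indicates a more coherent bicluster, and a submatrix whose entries are exactly a row effect plus a column effect scores zero.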

AVAILABILITY

Supplementary material, including code implementations in the LEDA C++ library, experimental data, and the results, is available at http://code.google.com/p/biclustering/

CONTACTS

cesim@khas.edu.tr; melihsozdinler@boun.edu.tr

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Improving performances of suboptimal greedy iterative biclustering heuristics via localization.
Bioinformatics. 2010 Oct 15;26(20):2594-600. doi: 10.1093/bioinformatics/btq473. Epub 2010 Aug 23.
2
A new geometric biclustering algorithm based on the Hough transform for analysis of large-scale microarray data.
J Theor Biol. 2008 Mar 21;251(2):264-74. doi: 10.1016/j.jtbi.2007.11.030. Epub 2007 Dec 4.
3
Towards clustering of incomplete microarray data without the use of imputation.
Bioinformatics. 2007 Jan 1;23(1):107-13. doi: 10.1093/bioinformatics/btl555. Epub 2006 Oct 31.
4
Efficiently mining time-delayed gene expression patterns.
IEEE Trans Syst Man Cybern B Cybern. 2010 Apr;40(2):400-11. doi: 10.1109/TSMCB.2009.2025564. Epub 2009 Oct 30.
5
Finding multiple coherent biclusters in microarray data using variable string length multiobjective genetic algorithm.
IEEE Trans Inf Technol Biomed. 2009 Nov;13(6):969-75. doi: 10.1109/TITB.2009.2017527. Epub 2009 Mar 16.
6
BARTMAP: a viable structure for biclustering.
Neural Netw. 2011 Sep;24(7):709-16. doi: 10.1016/j.neunet.2011.03.020. Epub 2011 Apr 13.
7
An iterative data mining approach for mining overlapping coexpression patterns in noisy gene expression data.
IEEE Trans Nanobioscience. 2009 Sep;8(3):252-8. doi: 10.1109/TNB.2009.2026747. Epub 2009 Jul 14.
8
A stable iterative method for refining discriminative gene clusters.
BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S18. doi: 10.1186/1471-2164-9-S2-S18.
9
Making the most of microarray data.
Nat Genet. 2000 Mar;24(3):204-6. doi: 10.1038/73392.
10
How does gene expression clustering work?
Nat Biotechnol. 2005 Dec;23(12):1499-501. doi: 10.1038/nbt1205-1499.

Cited by

1
From data towards knowledge: revealing the architecture of signaling systems by unifying knowledge mining and data mining of systematic perturbation data.
PLoS One. 2013 Apr 23;8(4):e61134. doi: 10.1371/journal.pone.0061134. Print 2013.
2
Biclustering for the comprehensive search of correlated gene expression patterns using clustered seed expansion.
BMC Genomics. 2013 Mar 5;14:144. doi: 10.1186/1471-2164-14-144.