Biclustering data analysis: a comprehensive survey.

Affiliation

LASIGE, Faculdade de Ciências, Universidade de Lisboa, Campo Grande 16, P-1749-016 Lisbon, Portugal.

Publication information

Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae342.

DOI:10.1093/bib/bbae342
PMID:39007596
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11247412/
Abstract

Biclustering, the simultaneous clustering of rows and columns of a data matrix, has proved effective in bioinformatics because it produces local rather than global models. It has evolved from a key technique in gene expression data analysis into one of the most widely used approaches for pattern discovery and identification of biological modules, in both descriptive and predictive learning tasks. This survey presents a comprehensive overview of biclustering. It proposes an updated taxonomy for its fundamental components (bicluster, biclustering solution, biclustering algorithms, and evaluation measures) and applications. We unify scattered concepts in the literature with new definitions to accommodate the diversity of data types (such as tabular, network, and time series data) and the specificities of biological and biomedical data domains. We further propose a pipeline for biclustering data analysis and discuss practical aspects of incorporating biclustering in real-world applications. We highlight prominent application domains, particularly in bioinformatics, and identify typical biclusters to illustrate the analysis output. Moreover, we discuss important aspects to consider when choosing, applying, and evaluating a biclustering algorithm. We also relate biclustering to other data mining tasks (clustering, pattern mining, classification, triclustering, N-way clustering, and graph mining). The survey thus provides theoretical and practical guidance on biclustering data analysis, demonstrating its potential to uncover actionable insights from complex datasets.
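
To make the idea of a bicluster as a "local model" concrete, the sketch below scores a row-and-column subset of a small matrix with the mean squared residue, a coherence measure commonly attributed to Cheng and Church. This is an illustrative assumption, not a method taken from the abstract: the function name `mean_squared_residue`, the toy matrix `data`, and the chosen row and column indices are all hypothetical. A score near zero means the selected submatrix follows an additive pattern, which the matrix as a whole may not.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    Lower values indicate a more coherent (additive) pattern.
    """
    # Extract the candidate bicluster as a list of lists.
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    # Row means, column means, and the overall mean of the submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    # Average the squared residue of every cell.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: rows 0, 1, 3 and columns 0, 1, 3 share an
# additive pattern; the remaining row and column break it.
data = [
    [1.0, 2.0, 9.0, 4.0],
    [2.0, 3.0, 0.0, 5.0],
    [7.0, 1.0, 3.0, 8.0],
    [4.0, 5.0, 6.0, 7.0],
]

coherent = mean_squared_residue(data, rows=[0, 1, 3], cols=[0, 1, 3])
whole = mean_squared_residue(data, rows=range(4), cols=range(4))
print(f"bicluster MSR: {coherent:.4f}, whole-matrix MSR: {whole:.4f}")
```

The selected submatrix scores close to zero while the full matrix does not, which is the sense in which a bicluster captures structure that a global clustering of all rows or all columns would miss.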

Figures 1 to 17 are available in the full text (PMC11247412).
