RecBic: a fast and accurate algorithm recognizing trend-preserving biclusters.

Affiliations

Research Center for Mathematics and Interdisciplinary Sciences.

School of Mathematics, Shandong University, Jinan 250100, China.

Publication information

Bioinformatics. 2020 Dec 22;36(20):5054-5060. doi: 10.1093/bioinformatics/btaa630.

DOI:10.1093/bioinformatics/btaa630
PMID:32653907
Abstract

MOTIVATION

Biclustering has emerged as a powerful approach to identifying functional patterns in complex biological data. However, existing tools are limited in the accuracy and efficiency with which they recognize various kinds of complex biclusters submerged in ever-larger datasets. We introduce RecBic, a novel, fast and highly accurate algorithm that identifies various forms of complex biclusters in gene expression datasets.

RESULTS

We designed RecBic to identify various trend-preserving biclusters, particularly, those with narrow shapes, i.e. clusters where the number of genes is larger than the number of conditions/samples. Given a gene expression matrix, RecBic starts with a column seed, and grows it into a full-sized bicluster by simply repetitively comparing real numbers. When tested on simulated datasets in which the elements of implanted trend-preserving biclusters and those of the background matrix have the same distribution, RecBic was able to identify the implanted biclusters in a nearly perfect manner, outperforming all the compared salient tools in terms of accuracy and robustness to noise and overlaps between the clusters. Moreover, RecBic also showed superiority in identifying functionally related genes in real gene expression datasets.
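To make the term "trend-preserving" concrete, here is a minimal Python sketch. It is not the RecBic implementation, and the abstract does not specify the algorithm's details; the function names `is_trend_preserving` and `rows_supporting_seed`, the toy matrix, and the assumption that values within a row are free of ties are all illustrative. The first function checks whether a set of genes (rows) orders a set of conditions (columns) identically. The second shows the kind of plain real-number comparison that a column-pair seed could rely on.

```python
import numpy as np

def is_trend_preserving(matrix, rows, cols):
    """Return True if every selected row orders the selected columns identically.

    Assumes no ties within a row (illustrative simplification).
    """
    sub = np.asarray(matrix, dtype=float)[np.ix_(rows, cols)]
    # argsort gives, per row, the column positions from lowest to highest value.
    orders = np.argsort(sub, axis=1, kind="stable")
    return bool((orders == orders[0]).all())

def rows_supporting_seed(matrix, col_a, col_b):
    """Hypothetical seed step: rows where column col_a is strictly below col_b."""
    m = np.asarray(matrix, dtype=float)
    return np.flatnonzero(m[:, col_a] < m[:, col_b]).tolist()

# Toy expression matrix: rows are genes, columns are conditions.
expr = np.array([
    [1.0, 5.0, 3.0, 9.0],      # gene 0: c0 < c2 < c1 < c3
    [0.2, 2.0, 0.9, 4.0],      # gene 1: same ordering as gene 0
    [7.0, 1.0, 8.0, 2.0],      # gene 2: different ordering
    [10.0, 30.0, 20.0, 40.0],  # gene 3: same ordering as gene 0
])

same_trend = is_trend_preserving(expr, [0, 1, 3], [0, 1, 2, 3])
with_outlier = is_trend_preserving(expr, [0, 1, 2, 3], [0, 1, 2, 3])
seed_rows = rows_supporting_seed(expr, 0, 1)
```

Genes 0, 1 and 3 differ widely in magnitude but share one ordering of the conditions, so they form a trend-preserving bicluster, while gene 2 breaks the pattern. A real tool would also have to tolerate noise and overlaps between clusters, which this sketch ignores.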

AVAILABILITY AND IMPLEMENTATION

Code, sample input data and usage instructions are available at the following websites. Code: https://github.com/holyzews/RecBic/tree/master/RecBic/. Data: http://doi.org/10.5281/zenodo.3842717.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
RecBic: a fast and accurate algorithm recognizing trend-preserving biclusters.
Bioinformatics. 2020 Dec 22;36(20):5054-5060. doi: 10.1093/bioinformatics/btaa630.
2
ARBic: an all-round biclustering algorithm for analyzing gene expression data.
NAR Genom Bioinform. 2023 Jan 31;5(1):lqad009. doi: 10.1093/nargab/lqad009. eCollection 2023 Mar.
3
Comparison of sparse biclustering algorithms for gene expression datasets.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab140.
4
UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data.
Sci Rep. 2016 Mar 22;6:23466. doi: 10.1038/srep23466.
5
FABIA: factor analysis for bicluster acquisition.
Bioinformatics. 2010 Jun 15;26(12):1520-7. doi: 10.1093/bioinformatics/btq227. Epub 2010 Apr 23.
6
Discovery of error-tolerant biclusters from noisy gene expression data.
BMC Bioinformatics. 2011 Nov 24;12 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-12-S12-S1.
7
Bi-correlation clustering algorithm for determining a set of co-regulated genes.
Bioinformatics. 2009 Nov 1;25(21):2795-801. doi: 10.1093/bioinformatics/btp526. Epub 2009 Sep 3.
8
QUBIC2: a novel and robust biclustering algorithm for analyses and interpretation of large-scale RNA-Seq data.
Bioinformatics. 2020 Feb 15;36(4):1143-1149. doi: 10.1093/bioinformatics/btz692.
9
QUBIC: a bioconductor package for qualitative biclustering analysis of gene co-expression data.
Bioinformatics. 2017 Feb 1;33(3):450-452. doi: 10.1093/bioinformatics/btw635.
10
BicSPAM: flexible biclustering using sequential patterns.
BMC Bioinformatics. 2014 May 6;15:130. doi: 10.1186/1471-2105-15-130.

Cited by

1
TransBic: bucket trend-preserving biclustering for finding local and interpretable expression patterns.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf050.
2
Online-adjusted evolutionary biclustering algorithm to identify significant modules in gene expression data.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae681.
3
scQA: A dual-perspective cell type identification model for single cell transcriptome data.
Comput Struct Biotechnol J. 2023 Dec 21;23:520-536. doi: 10.1016/j.csbj.2023.12.021. eCollection 2024 Dec.
4
ARBic: an all-round biclustering algorithm for analyzing gene expression data.
NAR Genom Bioinform. 2023 Jan 31;5(1):lqad009. doi: 10.1093/nargab/lqad009. eCollection 2023 Mar.
5
REW-ISA V2: A Biclustering Method Fusing Homologous Information for Analyzing and Mining Epi-Transcriptome Data.
Front Genet. 2021 May 28;12:654820. doi: 10.3389/fgene.2021.654820. eCollection 2021.