Shifting and scaling patterns from gene expression data.

Author information

Aguilar-Ruiz Jesús S

Affiliation

BIGS BioInformatics Group Seville, University of Seville, Pablo de Olavide University, Spain.

Publication information

Bioinformatics. 2005 Oct 15;21(20):3840-5. doi: 10.1093/bioinformatics/bti641. Epub 2005 Sep 6.

DOI: 10.1093/bioinformatics/bti641
PMID: 16144809

Abstract

MOTIVATION

In recent years, the discovery of biclusters in data has become increasingly popular. Biclustering aims to extract a set of clusters, each of which may use a different subset of attributes. Biclustering techniques are therefore more useful than traditional clustering techniques, especially when datasets have high or very high dimensionality. Biclustering also allows clusters to overlap, which is interesting both algorithmically and for the interpretation of results. Since the work of Cheng and Church, the mean squared residue has become one of the most popular measures for searching for biclusters, which ideally should capture shifting and scaling patterns.
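
For reference, the mean squared residue of Cheng and Church is usually written as follows. The notation here (a bicluster with gene set I, condition set J, and entries a_ij) is an assumption for illustration; the abstract itself does not give a formula.

```latex
H(I,J) = \frac{1}{|I|\,|J|} \sum_{i \in I} \sum_{j \in J}
         \left( a_{ij} - a_{iJ} - a_{Ij} + a_{IJ} \right)^{2},
\qquad
a_{iJ} = \frac{1}{|J|} \sum_{j \in J} a_{ij},\quad
a_{Ij} = \frac{1}{|I|} \sum_{i \in I} a_{ij},\quad
a_{IJ} = \frac{1}{|I|\,|J|} \sum_{i \in I} \sum_{j \in J} a_{ij}.
```

A lower value of H(I,J) indicates a more coherent bicluster, and H(I,J) = 0 means every entry is exactly explained by its row mean, column mean, and the overall mean.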

RESULTS

In this work, we identify both types of patterns (shifting and scaling) and demonstrate that the mean squared residue is very useful for finding shifting patterns but is not appropriate for finding scaling patterns, because the mean squared residue is not zero even for a perfect scaling pattern. In addition, we show that the mean squared residue is highly dependent on the variance of the scaling factor, so any algorithm based on this measure might fail to find these patterns when the variance of gene values is high. The main contribution of this paper is to prove that the mean squared residue is not mathematically precise enough to discover shifting and scaling patterns at the same time.
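
The claim can be checked numerically with a minimal sketch. It is not the paper's code: the function name `mean_squared_residue`, the base profile, and the offsets and scaling factors are all hypothetical, and the patterns are modeled under the assumption that a shifting pattern adds a per-gene constant to a shared profile while a scaling pattern multiplies it by one. Under that assumption, substituting a scaled row into the residue definition gives (alpha_i - mean(alpha)) * (b_j - mean(b)), so the mean squared residue equals the product of the population variances of the scaling factors and of the base profile.

```python
def mean_squared_residue(matrix):
    """Mean squared residue (Cheng and Church) of a bicluster given as a list of rows."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            residue = value - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical base expression profile over four conditions.
base = [1.0, 4.0, 2.0, 6.0]

# Perfect shifting pattern: each gene is the base profile plus a constant offset.
shifting = [[b + beta for b in base] for beta in (0.0, 2.0, 4.0)]

# Perfect scaling patterns: each gene is the base profile times a constant factor.
scaling_low_var = [[alpha * b for b in base] for alpha in (1.0, 1.1, 1.2)]
scaling_high_var = [[alpha * b for b in base] for alpha in (1.0, 2.0, 3.0)]

msr_shift = mean_squared_residue(shifting)
msr_scale_low = mean_squared_residue(scaling_low_var)
msr_scale_high = mean_squared_residue(scaling_high_var)

print("shifting pattern MSR:          ", msr_shift)
print("scaling pattern MSR (low var): ", msr_scale_low)
print("scaling pattern MSR (high var):", msr_scale_high)
```

In this sketch the shifting bicluster scores zero (up to floating-point error), while both scaling biclusters score above zero and the score grows with the spread of the scaling factors. A threshold-based search on this measure could therefore reject a perfect scaling pattern.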

CONTACT

aguilar@lsi.us.es.
