Defining transcription modules using large-scale gene expression data.

Authors

Ihmels Jan, Bergmann Sven, Barkai Naama

Affiliation

Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.

Publication

Bioinformatics. 2004 Sep 1;20(13):1993-2003. doi: 10.1093/bioinformatics/bth166. Epub 2004 Mar 25.

DOI: 10.1093/bioinformatics/bth166
PMID: 15044247

Abstract

MOTIVATION

Large-scale gene expression data comprising a variety of cellular conditions hold the promise of a global view of the transcription program. While conventional clustering algorithms have been applied successfully to smaller datasets, many of them are of limited use for large-scale data because they cannot capture combinatorial and condition-specific co-regulation. In addition, there is an increasing need to integrate the rapidly accumulating body of other high-throughput biological data with the expression analysis. In previous work, we introduced the signature algorithm, which overcomes the problems of conventional clustering and allows for intuitive integration of additional biological data. However, this approach is constrained by the comprehensiveness of the relevant external data and by its inability to capture hierarchical modularity.

METHODS

We present a novel method for the analysis of large-scale expression data, which assigns genes to context-dependent and potentially overlapping regulatory units. We introduce the notion of a transcription module as a self-consistent regulatory unit consisting of a set of co-regulated genes as well as the experimental conditions that induce their co-regulation. Self-consistency is defined by a rigorous mathematical criterion. We propose an efficient algorithm to identify such modules, which is based on the iterative application of the signature algorithm. A threshold parameter determines the resolution of the modular decomposition.
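
The iterative procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`zscore`, `threshold`, `find_module`), the use of z-score normalization, the threshold values, the convergence test, and the synthetic data are all choices made for the example. The sketch alternates between scoring conditions from the current gene set and scoring genes from the selected conditions, and stops when the gene set reproduces itself, which is one simple reading of "self-consistent".

```python
import numpy as np

def zscore(x, axis):
    """Standardize x along the given axis (zero mean, unit variance)."""
    mu = x.mean(axis=axis, keepdims=True)
    sd = x.std(axis=axis, keepdims=True)
    return (x - mu) / np.where(sd == 0, 1.0, sd)

def threshold(scores, t, two_sided=False):
    """Zero out scores within t standard deviations of the mean score,
    then rescale the surviving scores to unit length."""
    sd = scores.std()
    if sd == 0:
        return np.zeros_like(scores)
    z = (scores - scores.mean()) / sd
    keep = np.abs(z) > t if two_sided else z > t
    kept = np.where(keep, scores, 0.0)
    norm = np.linalg.norm(kept)
    return kept / norm if norm > 0 else kept

def find_module(E, seed_genes, t_gene=1.5, t_cond=1.2, max_iter=100):
    """Iterate gene -> condition -> gene scoring until the gene set is stable.

    E is a genes x conditions expression matrix. The two normalizations
    below are an assumption of this sketch.
    """
    E_c = zscore(E, axis=1)  # each gene standardized across conditions
    E_g = zscore(E, axis=0)  # each condition standardized across genes
    g = np.zeros(E.shape[0])
    g[seed_genes] = 1.0
    for _ in range(max_iter):
        # Conditions in which the current genes change coherently (up or down).
        c = threshold(E_c.T @ g, t_cond, two_sided=True)
        # Genes that respond strongly across the selected conditions.
        g_new = threshold(E_g @ c, t_gene)
        converged = np.array_equal(g_new != 0, g != 0)
        g = g_new
        if converged:
            break
    return np.flatnonzero(g), np.flatnonzero(c)

# Synthetic data (hypothetical): 200 genes x 50 conditions of noise, with
# genes 0-19 induced together in conditions 0-9.
rng = np.random.default_rng(0)
E = rng.normal(size=(200, 50))
E[:20, :10] += 4.0

# Start from half of the planted module and let the iteration complete it.
genes, conds = find_module(E, seed_genes=np.arange(10))
print(len(genes), "genes and", len(conds), "conditions in the module")
```

In this sketch, raising `t_gene` keeps only the most strongly co-regulated genes and lowering it admits looser ones, which is the sense in which a threshold can set the resolution of the decomposition. Running the search from many different seed sets would allow overlapping modules to be found, since nothing forces a gene to belong to only one fixed point.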

RESULTS

The method is applied systematically to over 1000 expression profiles of the yeast Saccharomyces cerevisiae, and the results are presented using two complementary visualization schemes we developed. The average biological coherence, as measured by the conservation of putative cis-regulatory motifs between four related yeast species, is higher for transcription modules than for clusters identified by other methods applied to the same dataset. Our method is related to singular value decomposition (SVD) and to the pairwise average linkage clustering algorithm. It extends SVD by filtering out noise in the expression data and offering variable resolution to reveal hierarchical organization. It furthermore has the advantage over both methods of capturing overlapping modules in the presence of combinatorial regulation.
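
One way to see a connection to SVD, offered as an illustration rather than as the authors' derivation: if the thresholds are dropped and a single matrix is used, alternating between gene scores and condition scores is power iteration, which converges to the leading pair of singular vectors. The function name `alternate_without_threshold` and the synthetic matrix below are assumptions of this sketch.

```python
import numpy as np

def alternate_without_threshold(E, n_iter=50, seed=0):
    """Alternate condition and gene scoring with no thresholding.

    This is power iteration on E @ E.T, so g and c approach the leading
    left and right singular vectors of E (up to sign).
    """
    rng = np.random.default_rng(seed)
    g = rng.normal(size=E.shape[0])
    g /= np.linalg.norm(g)
    for _ in range(n_iter):
        c = E.T @ g              # condition scores from gene scores
        c /= np.linalg.norm(c)
        g = E @ c                # gene scores from condition scores
        g /= np.linalg.norm(g)
    return g, c

# Synthetic matrix (hypothetical): one strong rank-1 pattern plus noise.
rng = np.random.default_rng(1)
u = rng.normal(size=60)
v = rng.normal(size=30)
E = 3.0 * np.outer(u, v) + rng.normal(size=(60, 30))

g, c = alternate_without_threshold(E)
U, S, Vt = np.linalg.svd(E, full_matrices=False)

# Absolute overlaps, since singular vectors are defined only up to sign.
overlap_g = abs(g @ U[:, 0])
overlap_c = abs(c @ Vt[0])
print(overlap_g, overlap_c)
```

Without thresholds, every gene and every condition contributes to every score, so the result is a single dense pattern. Thresholding restricts each step to the strongest-scoring genes and conditions, which is where noise filtering and adjustable resolution can enter.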

SUPPLEMENTARY INFORMATION

http://www.weizmann.ac.il/~barkai/modules
