A graph-based approach to systematically reconstruct human transcriptional regulatory modules.

Authors

Yan Xifeng, Mehan Michael R, Huang Yu, Waterman Michael S, Yu Philip S, Zhou Xianghong Jasmine

Affiliation

IBM T. J. Watson Research Center, Hawthorne, NY, USA.

Publication

Bioinformatics. 2007 Jul 1;23(13):i577-86. doi: 10.1093/bioinformatics/btm227.

DOI: 10.1093/bioinformatics/btm227
PMID: 17646346

Abstract

MOTIVATION

A major challenge in studying gene regulation is to systematically reconstruct transcription regulatory modules, defined as sets of genes regulated by a common set of transcription factors. A commonly used approach to transcription module reconstruction is to derive coexpression clusters from a microarray dataset. However, such results often contain false positives, because genes from many transcription modules may be perturbed simultaneously under a given type of condition. In this study, we propose and validate the hypothesis that genes forming a coexpression cluster in multiple microarray datasets across diverse conditions are more likely to form a transcription module. Identifying genes coexpressed in a subset of many microarray datasets, however, is not a trivial computational problem.

RESULTS

We propose a graph-based data-mining approach to efficiently and systematically identify frequent coexpression clusters. Given m microarray datasets, we model each dataset as a coexpression graph and search for vertex sets that are frequently densely connected across [θm] datasets (0 ≤ θ ≤ 1). For this novel graph-mining problem, we designed two techniques to narrow the search space: (1) partition the input graphs into (overlapping) groups sharing common properties; (2) summarize the vertex neighbor information from the partitioned datasets onto 'Neighbor Association Summary Graphs' for effective mining. We applied our method to 105 human microarray datasets and identified a large number of potential transcription modules, activated under different subsets of conditions. Validation with ChIP-chip data demonstrated that the likelihood of a coexpression cluster being a transcription module increases significantly with its recurrence. Our method opens a new way to exploit the vast amount of accumulated microarray data for gene regulation studies. Furthermore, the algorithm is applicable to other biological networks for approximate network module mining.
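
The problem statement above can be illustrated with a small brute-force sketch. This is not the authors' algorithm: it omits the graph partitioning and the summary-graph step entirely and simply enumerates candidate gene sets, which is only feasible for toy inputs. The function names, the `min_density` threshold, the fixed set size, and the reading of [θm] as a ceiling are all assumptions made for illustration.

```python
from itertools import combinations
from math import ceil


def make_graph(pairs):
    """Build a coexpression graph as a set of undirected edges."""
    return {frozenset(p) for p in pairs}


def density(vertices, edges):
    """Fraction of vertex pairs within `vertices` that are connected."""
    pairs = list(combinations(sorted(vertices), 2))
    if not pairs:
        return 0.0
    present = sum(1 for p in pairs if frozenset(p) in edges)
    return present / len(pairs)


def recurrence(vertices, graphs, min_density):
    """Number of graphs in which `vertices` is densely connected."""
    return sum(1 for g in graphs if density(vertices, g) >= min_density)


def frequent_dense_sets(graphs, size, theta, min_density):
    """Exhaustively find gene sets of a given size that are dense in at
    least ceil(theta * m) of the m graphs (the ceiling is an assumption)."""
    support = ceil(theta * len(graphs))
    genes = sorted({v for g in graphs for edge in g for v in edge})
    return [
        set(candidate)
        for candidate in combinations(genes, size)
        if recurrence(candidate, graphs, min_density) >= support
    ]


# Four toy "datasets"; genes A, B, C form a triangle in three of them.
graphs = [
    make_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]),
    make_graph([("A", "B"), ("B", "C"), ("A", "C")]),
    make_graph([("A", "B"), ("C", "D")]),
    make_graph([("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]),
]

result = frequent_dense_sets(graphs, size=3, theta=0.75, min_density=1.0)
print(result)
```

With θ = 0.75 and four graphs, a gene set must be dense in at least three graphs to be reported. The exhaustive enumeration grows combinatorially with the number of genes, which is the cost the two search-space reduction techniques in the abstract are meant to avoid.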

AVAILABILITY

http://zhoulab.usc.edu/NeMo/.

Similar articles

1
A graph-based approach to systematically reconstruct human transcriptional regulatory modules.
Bioinformatics. 2007 Jul 1;23(13):i577-86. doi: 10.1093/bioinformatics/btm227.
2
Systematic discovery of functional modules and context-specific functional annotation of human genome.
Bioinformatics. 2007 Jul 1;23(13):i222-9. doi: 10.1093/bioinformatics/btm222.
3
Predicting genetic regulatory response using classification.
Bioinformatics. 2004 Aug 4;20 Suppl 1:i232-40. doi: 10.1093/bioinformatics/bth923.
4
Mining coherent dense subgraphs across massive biological networks for functional discovery.
Bioinformatics. 2005 Jun;21 Suppl 1:i213-21. doi: 10.1093/bioinformatics/bti1049.
5
Genome-wide prediction of transcriptional regulatory elements of human promoters using gene expression and promoter analysis data.
BMC Bioinformatics. 2006 Jul 4;7:330. doi: 10.1186/1471-2105-7-330.
6
Regulatory motif finding by logic regression.
Bioinformatics. 2004 Nov 1;20(16):2799-811. doi: 10.1093/bioinformatics/bth333. Epub 2004 May 27.
7
Computational discovery of transcriptional regulatory rules.
Bioinformatics. 2005 Sep 1;21 Suppl 2:ii101-7. doi: 10.1093/bioinformatics/bti1117.
8
Predicting transcription factor binding sites using local over-representation and comparative genomics.
BMC Bioinformatics. 2006 Aug 31;7:396. doi: 10.1186/1471-2105-7-396.
9
Computational identification of combinatorial regulation and transcription factor binding sites.
Biotechnol Bioeng. 2007 Aug 15;97(6):1594-602. doi: 10.1002/bit.21354.
10
MotifCut: regulatory motifs finding with maximum density subgraphs.
Bioinformatics. 2006 Jul 15;22(14):e150-7. doi: 10.1093/bioinformatics/btl243.

Cited by

1
Deciphering RNA Regulatory Elements Involved in the Developmental and Environmental Gene Regulation of Trypanosoma brucei.
PLoS One. 2015 Nov 3;10(11):e0142342. doi: 10.1371/journal.pone.0142342. eCollection 2015.
2
AIM: a comprehensive Arabidopsis interactome module database and related interologs in plants.
Database (Oxford). 2014 Dec 4;2014:bau117. doi: 10.1093/database/bau117. Print 2014.
3
Integrative analysis of many RNA-seq datasets to study alternative splicing.
Methods. 2014 Jun 1;67(3):313-24. doi: 10.1016/j.ymeth.2014.02.024. Epub 2014 Feb 28.
4
A scalable method for discovering significant subnetworks.
BMC Syst Biol. 2013;7 Suppl 4(Suppl 4):S3. doi: 10.1186/1752-0509-7-S4-S3. Epub 2013 Oct 23.
5
Constructing higher-order miRNA-mRNA interaction networks in prostate cancer via hypergraph-based learning.
BMC Syst Biol. 2013 Jun 19;7:47. doi: 10.1186/1752-0509-7-47.
6
Genomic positions of co-expressed genes: echoes of chromosome organisation in gene expression data.
BMC Res Notes. 2013 Jun 13;6:229. doi: 10.1186/1756-0500-6-229.
7
Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review.
Pharmacol Ther. 2013 Jun;138(3):333-408. doi: 10.1016/j.pharmthera.2013.01.016. Epub 2013 Feb 4.
8
Integrating many co-splicing networks to reconstruct splicing regulatory modules.
BMC Syst Biol. 2012;6 Suppl 1(Suppl 1):S17. doi: 10.1186/1752-0509-6-S1-S17. Epub 2012 Jul 16.
9
Algorithm to identify frequent coupled modules from two-layered network series: application to study transcription and splicing coupling.
J Comput Biol. 2012 Jun;19(6):710-30. doi: 10.1089/cmb.2012.0025.
10
An overlapping module identification method in protein-protein interaction networks.
BMC Bioinformatics. 2012 May 8;13 Suppl 7(Suppl 7):S4. doi: 10.1186/1471-2105-13-S7-S4.