Combining sequence and time series expression data to learn transcriptional modules.

Authors

Kundaje Anshul, Middendorf Manuel, Gao Feng, Wiggins Chris, Leslie Christina

Affiliation

Department of Computer Science, Columbia University, New York 10027, USA.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2005 Jul-Sep;2(3):194-202. doi: 10.1109/TCBB.2005.34.

DOI: 10.1109/TCBB.2005.34
PMID: 17044183

Abstract

Our goal is to cluster genes into transcriptional modules--sets of genes where similarity in expression is explained by common regulatory mechanisms at the transcriptional level. We want to learn modules from both time series gene expression data and genome-wide motif data that are now readily available for organisms such as S. cerevisiae as a result of prior computational studies or experimental results. We present a generative probabilistic model for combining regulatory sequence and time series expression data to cluster genes into coherent transcriptional modules. Starting with a set of motifs representing known or putative regulatory elements (transcription factor binding sites) and the counts of occurrences of these motifs in each gene's promoter region, together with a time series expression profile for each gene, the learning algorithm uses expectation maximization to learn module assignments based on both types of data. We also present a technique based on the Jensen-Shannon entropy contributions of motifs in the learned model for associating the most significant motifs with each module. Thus, the algorithm gives a global approach for associating sets of regulatory elements with "modules" of genes with similar time series expression profiles. The model for expression data exploits our prior belief of smooth dependence on time by using statistical splines and is suitable for typical time course data sets with relatively few experiments. Moreover, the model is sufficiently interpretable that we can understand how both sequence data and expression data contribute to the cluster assignments, and how to interpolate between the two data sources. We present experimental results on the yeast cell cycle to validate our method and find that our combined expression and motif clustering algorithm discovers modules with both coherent expression and similar motif patterns, including binding motifs associated with known cell cycle transcription factors.
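
The motif-ranking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this is only one plausible reading: it assumes each learned module has a probability distribution over motifs, computes the weighted Jensen-Shannon divergence between those distributions, and splits it into one term per (module, motif) pair so that motifs can be ranked within each module. The function names, motif names, and all numbers below are hypothetical and are not taken from the paper.

```python
import math


def js_contributions(module_dists, weights):
    """Weighted Jensen-Shannon divergence (in bits) and its per-term breakdown.

    contrib[k][m] = w_k * p_k[m] * log2(p_k[m] / mix[m]), where mix is the
    weighted average of the module distributions. The terms sum to the
    divergence; individual terms can be negative.
    """
    n_motifs = len(module_dists[0])
    # Mixture distribution over motifs, averaged across modules.
    mix = [sum(w * p[m] for w, p in zip(weights, module_dists))
           for m in range(n_motifs)]
    contrib = [
        [w * p[m] * math.log2(p[m] / mix[m]) if w > 0 and p[m] > 0 else 0.0
         for m in range(n_motifs)]
        for w, p in zip(weights, module_dists)
    ]
    js = sum(sum(row) for row in contrib)
    return js, contrib


def top_motifs(contrib_row, names, n=2):
    """Names of the n motifs with the largest contribution in one module."""
    order = sorted(range(len(names)), key=lambda m: contrib_row[m], reverse=True)
    return [names[m] for m in order[:n]]


# Toy input (made up): two modules, four motifs, equal module weights.
motifs = ["motif_A", "motif_B", "motif_C", "motif_D"]
module_dists = [
    [0.55, 0.25, 0.10, 0.10],  # hypothetical module 0
    [0.10, 0.10, 0.45, 0.35],  # hypothetical module 1
]
weights = [0.5, 0.5]

js, contrib = js_contributions(module_dists, weights)
print("JS divergence (bits):", js)
for k, row in enumerate(contrib):
    print("module", k, "top motifs:", top_motifs(row, motifs))
```

A motif scores highly for a module when it is much more frequent there than in the mixture over all modules, which matches the intuition of a motif being "significant" for that module. The model described in the abstract may define the contributions differently.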
