A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes.

Author information

Thijs Gert, Marchal Kathleen, Lescot Magali, Rombauts Stephane, De Moor Bart, Rouzé Pierre, Moreau Yves

Affiliation

ESAT-SCD, KULeuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium.

Publication information

J Comput Biol. 2002;9(2):447-64. doi: 10.1089/10665270252935566.

Abstract

Microarray experiments can reveal important information about transcriptional regulation. In our case, we look for potential promoter regulatory elements in the upstream region of coexpressed genes. Here we present two modifications of the original Gibbs sampling algorithm for motif finding (Lawrence et al., 1993). First, we introduce the use of a probability distribution to estimate the number of copies of the motif in a sequence. Second, we describe the technical aspects of the incorporation of a higher-order background model whose application we discussed in Thijs et al. (2001). Our implementation is referred to as the Motif Sampler. We successfully validate our algorithm on several data sets. First, we show results for three sets of upstream sequences containing known motifs: 1) the G-box light-response element in plants, 2) elements involved in methionine response in Saccharomyces cerevisiae, and 3) the FNR O2-responsive element in bacteria. We use these data sets to explain the influence of the parameters on the performance of our algorithm. Second, we show results for upstream sequences from four clusters of coexpressed genes identified in a microarray experiment on wounding in Arabidopsis thaliana. Several motifs could be matched to regulatory elements from plant defence pathways in our database of plant cis-acting regulatory elements (PlantCARE). Some other strong motifs do not have corresponding motifs in PlantCARE but are promising candidates for further analysis.
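
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of a Gibbs site sampler in the general style of the original algorithm. It is not the Motif Sampler itself: it assumes exactly one motif copy per sequence and a zero-order (single-nucleotide) background, whereas the paper estimates the copy number per sequence and uses a higher-order background model. All function names, parameters, and the toy sequences are hypothetical and chosen only for illustration.

```python
import math
import random

ALPHABET = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Column-wise nucleotide frequencies of aligned sites, with pseudocounts."""
    width = len(sites[0])
    pwm = []
    for col in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm

def background_frequencies(sequences, pseudocount=1.0):
    """Zero-order background (an assumption; the paper uses a higher-order model)."""
    counts = {base: pseudocount for base in ALPHABET}
    for seq in sequences:
        for base in seq:
            counts[base] += 1
    total = sum(counts.values())
    return {base: counts[base] / total for base in ALPHABET}

def gibbs_site_sampler(sequences, width, iterations=200, seed=0):
    """Sample one motif start per sequence; returns (positions, pwm).

    Assumes at least two sequences, each of length >= width and
    containing only the characters A, C, G, T.
    """
    rng = random.Random(seed)
    background = background_frequencies(sequences)
    positions = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(iterations):
        for i, seq in enumerate(sequences):
            # Build the motif model from every sequence except the current one.
            others = [s[p:p + width]
                      for j, (s, p) in enumerate(zip(sequences, positions))
                      if j != i]
            pwm = build_pwm(others)
            # Weight each candidate start by its motif/background likelihood ratio.
            weights = []
            for start in range(len(seq) - width + 1):
                log_ratio = sum(
                    math.log(pwm[k][seq[start + k]] / background[seq[start + k]])
                    for k in range(width)
                )
                weights.append(math.exp(log_ratio))
            # Draw the new start position in proportion to those weights.
            positions[i] = rng.choices(range(len(weights)), weights=weights)[0]
    sites = [s[p:p + width] for s, p in zip(sequences, positions)]
    return positions, build_pwm(sites)

# Made-up toy sequences, each containing the hexamer CACGTG.
toy = ["TTGACACGTGGCAT", "CACGTGTTAGCCTA", "GGATCCACGTGAAT", "ATTCGCACGTGTCA"]
positions, pwm = gibbs_site_sampler(toy, width=6)
```

Because the sampler is stochastic, different seeds can settle on different alignments; in practice such a sampler is usually run several times and the resulting motifs compared.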
