W-AlignACE: an improved Gibbs sampling algorithm based on more accurate position weight matrices learned from sequence and gene expression/ChIP-chip data.

Author information

Chen Xin, Guo Lingqiong, Fan Zhaocheng, Jiang Tao

Affiliation

School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore.

Publication information

Bioinformatics. 2008 May 1;24(9):1121-8. doi: 10.1093/bioinformatics/btn088. Epub 2008 Mar 5.

Abstract

MOTIVATION

Position weight matrices (PWMs) are widely used to depict the DNA binding preferences of transcription factors (TFs) in computational molecular biology and regulatory genomics. Thus, learning an accurate PWM to characterize the binding sites of a specific TF is a fundamental problem that plays an important role in modeling regulatory motifs and also in discovering the regulatory targets of TFs.

RESULTS

We study how to learn a more accurate PWM from both binding sequences and gene expression (or ChIP-chip) data, and propose to find the PWM that maximises the likelihood of simultaneously observing the binding sequences and their associated gene expression (or ChIP-chip) data. To solve this maximum likelihood problem, we introduce a sequence weighting scheme based on the observation that binding sites inducing drastic fold changes in mRNA expression (or showing strong binding ratios in ChIP experiments) are likely to represent a true motif. We have incorporated this new learning approach into the popular motif finding program AlignACE. The modified program, called W-AlignACE, is compared with three other programs (AlignACE, MDscan and MotifRegressor) on a variety of datasets, including simulated, mRNA expression and ChIP-chip data. These tests demonstrate that W-AlignACE is an effective tool for discovering TF binding motifs from gene expression (or ChIP-chip) data and, in particular, that it can find very weak motifs such as DIG1 and GAL4.
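The abstract does not give the weighting formula or the likelihood function, so the following is only a minimal sketch of the general idea of sequence weighting, not the W-AlignACE implementation. It assumes that each aligned binding site carries a non-negative weight (for example, one derived from its fold change or ChIP binding ratio) and that this weight replaces the usual count of 1 when the PWM columns are estimated. The function names, the uniform background and the pseudocount value are all hypothetical choices made for illustration.

```python
import math

BASES = "ACGT"


def weighted_pwm(sites, weights, pseudocount=1.0, background=None):
    """Build a log-odds PWM from aligned sites of equal length.

    Each site adds its weight (instead of 1) to the base counts, so
    sites with larger weights shape the matrix more strongly.
    Assumption: weights are non-negative and pseudocount is positive.
    """
    if background is None:
        # Hypothetical uniform background; real genomes are usually skewed.
        background = {b: 0.25 for b in BASES}
    if not sites or len(sites) != len(weights):
        raise ValueError("need one weight per site")
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    pwm = []
    for pos in range(width):
        # Start from background-proportional pseudocounts to avoid log(0).
        counts = {b: pseudocount * background[b] for b in BASES}
        for site, w in zip(sites, weights):
            counts[site[pos]] += w
        total = sum(counts.values())
        # Log-odds of the weighted frequency against the background.
        pwm.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return pwm


def score_site(pwm, seq):
    """Sum the log-odds entries of a candidate site under the PWM."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM width")
    return sum(col[base] for col, base in zip(pwm, seq))


if __name__ == "__main__":
    # Toy data: the first two sites are given larger (hypothetical) weights.
    sites = ["TGACTC", "TGACTC", "AGACTA"]
    weighted = weighted_pwm(sites, [3.0, 2.0, 0.5])
    uniform = weighted_pwm(sites, [1.0, 1.0, 1.0])
    print(score_site(weighted, "TGACTC"), score_site(uniform, "TGACTC"))
```

In this toy example the heavily weighted sites dominate the column frequencies, so the weighted matrix scores their shared sequence higher than the equally weighted matrix does. How W-AlignACE actually derives its weights and uses them inside Gibbs sampling is described in the paper itself, not here.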

AVAILABILITY

http://www.ntu.edu.sg/home/ChenXin/Gibbs
