Estimating transcription factor bindability on DNA.

Authors

Tsunoda T, Takagi T

Affiliation

Genome Data Base, Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan.

Publication

Bioinformatics. 1999 Jul-Aug;15(7-8):622-30. doi: 10.1093/bioinformatics/15.7.622.

Abstract

MOTIVATION

Precise analysis of the genetic network, gene function and transcription regulation requires accurate prediction of transcription factor (TF) bindability on DNA. For calculating the matching score between an input sequence and a set of known TF binding sites, we use positional weight matrices (PWMs) and Bucher's calculating method (Bucher, J. Mol. Biol., 212, 563-578, 1990). Since estimating TF binding sites requires cut-off values, we propose a robust cut-off value determining algorithm.
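As an illustration of what scoring a sequence against a PWM involves, here is a minimal sketch. It is not Bucher's method as used in the paper: the log-odds construction, the pseudocount, the uniform background, the min-max normalization, and the function names `pwm_from_counts`, `normalized_score` and `scan` are all assumptions made for this example.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=1.0, background=None):
    """Turn per-position base counts into a log-odds weight matrix.

    counts: list of dicts mapping base -> observed count, one dict per position.
    The pseudocount and uniform background are illustrative assumptions.
    """
    background = background or {b: 0.25 for b in BASES}
    pwm = []
    for col in counts:
        total = sum(col.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((col.get(b, 0) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm

def normalized_score(pwm, window):
    """Score one window, rescaled to [0, 1] between the worst and best possible match."""
    lo = sum(min(col.values()) for col in pwm)
    hi = sum(max(col.values()) for col in pwm)
    # Unknown symbols (e.g. N) are given the column minimum, a conservative assumption.
    raw = sum(col.get(b, min(col.values())) for col, b in zip(pwm, window.upper()))
    return (raw - lo) / (hi - lo)

def scan(pwm, seq):
    """Return (offset, score) for every window of the matrix width in seq."""
    w = len(pwm)
    return [(i, normalized_score(pwm, seq[i:i + w])) for i in range(len(seq) - w + 1)]

# Hypothetical 3-position matrix that strongly prefers "TAT".
example_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]
example_pwm = pwm_from_counts(example_counts)
hits = scan(example_pwm, "GGTATCC")
best_offset, best_score = max(hits, key=lambda h: h[1])
```

A window is reported as a predicted binding site only when its score reaches a cut-off, which is why the choice of cut-off matters.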

RESULTS

We generalize the concept of local overrepresentation with statistics, and propose a new algorithm for determining the cut-off value using the background rate estimated on non-promoters. The algorithm iteratively determines parameters separating instances into phenomena-dependent and phenomena-independent subsets. Our system includes the method of re-estimating cut-off values of TFs that mis-recognize other TF preferred regions. Our data source comprised 433 non-redundant vertebrate promoters including viral promoters, from Eukaryotic Promoter Database (EPD) R.50. The method is applied to 205 vertebrate TFs that have frequency matrices in TRANSFAC Ver.3.4 and the cut-off values of all of them can be determined.
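The idea of choosing a cut-off from a background rate can be sketched as follows. This is a simplified stand-in, not the iterative algorithm described in the paper: it picks, from a fixed list of candidates, the cut-off at which hits in promoters are most overrepresented relative to the hit rate seen in non-promoters. The z-score under a normal approximation to the binomial, the add-one smoothing, and the names `overrepresentation_z` and `choose_cutoff` are assumptions made for this example, and the score lists are invented.

```python
import math

def overrepresentation_z(promoter_scores, background_scores, cutoff):
    """z-score of the promoter hit count against the non-promoter hit rate.

    A hit is any window whose score is >= cutoff. The background rate is
    smoothed (add-one) so that it is never exactly 0 or 1.
    """
    n = len(promoter_scores)
    observed = sum(s >= cutoff for s in promoter_scores)
    bg_hits = sum(s >= cutoff for s in background_scores)
    rate = (bg_hits + 1) / (len(background_scores) + 2)
    expected = n * rate
    sd = math.sqrt(n * rate * (1 - rate))
    return (observed - expected) / sd

def choose_cutoff(promoter_scores, background_scores, candidates):
    """Return the candidate cut-off with the strongest overrepresentation."""
    return max(
        candidates,
        key=lambda c: overrepresentation_z(promoter_scores, background_scores, c),
    )

# Invented scores: high-scoring windows are ten times more common in promoters.
promoter_scores = [0.9] * 20 + [0.5] * 80
background_scores = [0.9] * 2 + [0.5] * 98
chosen = choose_cutoff(promoter_scores, background_scores, [0.4, 0.8])
```

With a permissive cut-off nearly every window is a hit in both sets, so there is little overrepresentation; a stricter cut-off isolates the windows that are enriched in promoters.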

AVAILABILITY

The cut-off values and TF binding site predicting tool are available at http://www.hgc.ims.u-tokyo.ac.jp/service/tooldoc/TFBIND. We also provide the cut-off value estimating programs.
