VOMBAT: prediction of transcription factor binding sites using variable order Bayesian trees.

Authors

Jan Grau, Irad Ben-Gal, Stefan Posch, Ivo Grosse

Affiliation

Institute of Computer Science, University Halle, 06099 Halle, Saale, Germany.

Publication details

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W529-33. doi: 10.1093/nar/gkl212.

DOI: 10.1093/nar/gkl212

PMID: 16845064

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1538886/

Abstract

Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of transcription factor binding sites, and it could be demonstrated that they outperform traditional models, such as position weight matrices, Markov models and Bayesian trees. We develop a web server for the recognition of DNA binding sites based on variable order Markov models and variable order Bayesian trees offering the following functionality: (i) given datasets with annotated binding sites and genomic background sequences, variable order Markov models and variable order Bayesian trees can be trained; (ii) given a set of trained models, putative DNA binding sites can be predicted in a given set of genomic sequences and (iii) given a dataset with annotated binding sites and a dataset with genomic background sequences, cross-validation experiments for different model combinations with different parameter settings can be performed. Several of the offered services are computationally demanding, such as genome-wide predictions of DNA binding sites in mammalian genomes or sets of 10^4-fold cross-validation experiments for different model combinations based on problem-specific datasets. In order to execute these jobs, and in order to serve multiple users at the same time, the web server is attached to a Linux cluster with 150 processors. VOMBAT is available at http://pdw-24.ipk-gatersleben.de:8080/VOMBAT/.
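The train-then-predict workflow in items (i) and (ii) can be illustrated with a much simpler stand-in model. The sketch below is not VOMBAT's method: it assumes a fixed first-order inhomogeneous Markov model for the binding sites and an order-0 background, whereas the server uses variable order Markov models and variable order Bayesian trees. All function names, the pseudocount, and the threshold are hypothetical choices made for illustration only.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def train_foreground(sites, pseudocount=1.0):
    """Fit a fixed first-order inhomogeneous Markov model to aligned sites.

    Assumption: all sites have equal length and contain only A, C, G, T.
    Returns (start, trans) where start[a] = P(x_0 = a) and
    trans[i][(a, b)] = P(x_i = b | x_{i-1} = a) for i >= 1.
    """
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all binding sites must have the same length")
    k = len(ALPHABET)
    start_counts = Counter(s[0] for s in sites)
    start_total = len(sites) + pseudocount * k
    start = {a: (start_counts[a] + pseudocount) / start_total for a in ALPHABET}
    trans = [None]  # index 0 is unused; position 0 is covered by `start`
    for i in range(1, width):
        pair_counts = Counter((s[i - 1], s[i]) for s in sites)
        table = {}
        for a in ALPHABET:
            row_total = sum(pair_counts[(a, b)] for b in ALPHABET) + pseudocount * k
            for b in ALPHABET:
                table[(a, b)] = (pair_counts[(a, b)] + pseudocount) / row_total
        trans.append(table)
    return start, trans


def train_background(sequences, pseudocount=1.0):
    """Estimate order-0 nucleotide frequencies from background sequences."""
    counts = Counter(c for s in sequences for c in s if c in ALPHABET)
    total = sum(counts[a] for a in ALPHABET) + pseudocount * len(ALPHABET)
    return {a: (counts[a] + pseudocount) / total for a in ALPHABET}


def log_odds(window, fg, bg):
    """Log-likelihood ratio of a window under foreground vs. background."""
    start, trans = fg
    score = math.log(start[window[0]] / bg[window[0]])
    for i in range(1, len(window)):
        score += math.log(trans[i][(window[i - 1], window[i])] / bg[window[i]])
    return score


def scan(sequence, fg, bg, threshold=0.0):
    """Slide over the sequence and report (position, window, score) above threshold."""
    width = len(fg[1])
    hits = []
    for pos in range(len(sequence) - width + 1):
        window = sequence[pos:pos + width]
        if any(c not in ALPHABET for c in window):
            continue  # skip windows with ambiguous symbols such as N
        score = log_odds(window, fg, bg)
        if score > threshold:
            hits.append((pos, window, score))
    return hits


if __name__ == "__main__":
    # Toy data, invented for this example only.
    fg = train_foreground(["TATAAA", "TATAAT", "TATATA", "TATAAA"])
    bg = train_background(["ACGT" * 10])
    for pos, window, score in scan("GCGCGCTATAAAGCGCGC", fg, bg, threshold=4.0):
        print(pos, window, round(score, 3))
```

A variable order model would differ in that the length of the conditioning context is itself learned per position, rather than fixed at one preceding nucleotide as assumed here. The threshold would in practice be chosen from held-out data, for example through the kind of cross-validation described in item (iii).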

Similar articles

VOMBAT: prediction of transcription factor binding sites using variable order Bayesian trees.

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W529-33. doi: 10.1093/nar/gkl212.

Recognition of cis-regulatory elements with vombat.

J Bioinform Comput Biol. 2007 Apr;5(2B):561-77. doi: 10.1142/s0219720007002886.

Identification of transcription factor binding sites with variable-order Bayesian networks.

Bioinformatics. 2005 Jun 1;21(11):2657-66. doi: 10.1093/bioinformatics/bti410. Epub 2005 Mar 29.

CREME: Cis-Regulatory Module Explorer for the human genome.

Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W253-6. doi: 10.1093/nar/gkh385.

Stubb: a program for discovery and analysis of cis-regulatory modules.

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W555-9. doi: 10.1093/nar/gkl224.

SwissRegulon: a database of genome-wide annotations of regulatory sites.

Nucleic Acids Res. 2007 Jan;35(Database issue):D127-31. doi: 10.1093/nar/gkl857. Epub 2006 Nov 27.

PReMod: a database of genome-wide mammalian cis-regulatory module predictions.

Nucleic Acids Res. 2007 Jan;35(Database issue):D122-6. doi: 10.1093/nar/gkl879. Epub 2006 Dec 5.

CONREAL web server: identification and visualization of conserved transcription factor binding sites.

Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W447-50. doi: 10.1093/nar/gki378.

rVISTA 2.0: evolutionary analysis of transcription factor binding sites.

Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W217-21. doi: 10.1093/nar/gkh383.

Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules.

Nat Protoc. 2008;3(10):1578-88. doi: 10.1038/nprot.2008.97.

Cited by

The evaluation of transcription factor binding site prediction tools in human and Arabidopsis genomes.

BMC Bioinformatics. 2024 Dec 2;25(1):371. doi: 10.1186/s12859-024-05995-0.

Employees recruitment: A prescriptive analytics approach via machine learning and mathematical programming.

Decis Support Syst. 2020 Jul;134:113290. doi: 10.1016/j.dss.2020.113290. Epub 2020 Apr 3.

Analysis of Genomic Sequence Motifs for Deciphering Transcription Factor Binding and Transcriptional Regulation in Eukaryotic Cells.

Front Genet. 2016 Feb 23;7:24. doi: 10.3389/fgene.2016.00024. eCollection 2016.

NURBS: a database of experimental and predicted nuclear receptor binding sites of mouse.

Bioinformatics. 2013 Jan 15;29(2):295-7. doi: 10.1093/bioinformatics/bts693. Epub 2012 Nov 29.

A new approach to bias correction in RNA-Seq.

Bioinformatics. 2012 Apr 1;28(7):921-8. doi: 10.1093/bioinformatics/bts055. Epub 2012 Jan 28.

References

Identification of transcription factor binding sites with variable-order Bayesian networks.

Bioinformatics. 2005 Jun 1;21(11):2657-66. doi: 10.1093/bioinformatics/bti410. Epub 2005 Mar 29.

NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins.

Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. doi: 10.1093/nar/gki025.

Splice site identification by idlBNs.

Bioinformatics. 2004 Aug 4;20 Suppl 1:i69-76. doi: 10.1093/bioinformatics/bth932.

MATCH: A tool for searching transcription factor binding sites in DNA sequences.

Nucleic Acids Res. 2003 Jul 1;31(13):3576-9. doi: 10.1093/nar/gkg585.

Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors.

Nucleic Acids Res. 2002 Mar 1;30(5):1255-61. doi: 10.1093/nar/30.5.1255.

SAMIE: statistical algorithm for modeling interaction energies.

Pac Symp Biocomput. 2001:115-26. doi: 10.1142/9789814447362_0013.

Variations on probabilistic suffix trees: statistical modeling and prediction of protein families.

Bioinformatics. 2001 Jan;17(1):23-43. doi: 10.1093/bioinformatics/17.1.23.

The TRANSFAC system on gene expression regulation.

Nucleic Acids Res. 2001 Jan 1;29(1):281-3. doi: 10.1093/nar/29.1.281.

Modeling splice sites with Bayes networks.

Bioinformatics. 2000 Feb;16(2):152-8. doi: 10.1093/bioinformatics/16.2.152.

Eukaryotic promoter recognition.

Genome Res. 1997 Sep;7(9):861-78. doi: 10.1101/gr.7.9.861.
