Université Paris-Sud 11, Centre National de la Recherche Scientifique, UMR 8621, Institut de Génétique et Microbiologie, Orsay, France.
BMC Bioinformatics. 2010 Nov 11;11:554. doi: 10.1186/1471-2105-11-554.
The binding of regulatory proteins to their specific DNA targets determines the accurate expression of the neighboring genes. The in silico prediction of new binding sites in completely sequenced genomes is key to a deeper understanding of gene regulatory networks. Several algorithms have been described for filtering out false positives in the prediction of new binding targets; however, none of them has so far been implemented to assist the detection of binding sites at the genomic scale.
FITBAR (Fast Investigation Tool for Bacterial and Archaeal Regulons) is a web service designed to identify new protein binding sites in fully sequenced prokaryotic genomes. The tool consists of a workbench in which the significance of the predictions can be compared using different statistical methods, a feature not found in existing resources. The Local Markov Model and the Compound Importance Sampling algorithms have been implemented to compute the P-value of newly discovered binding sites. In addition, FITBAR provides two optimized genomic scanning algorithms that use either log-odds or entropy-weighted position-specific scoring matrices. Other significant features include the production of a detailed genomic context map for each detected binding site and the export of the search results in spreadsheet and portable document formats. FITBAR's discovery of a high-affinity Escherichia coli NagC binding site was validated experimentally in vitro as well as in vivo, and published.
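To illustrate the general idea of scanning with a log-odds position-specific scoring matrix, here is a minimal Python sketch. It is not FITBAR's implementation: the function names, the uniform background, the pseudocount scheme, the example sites and the score threshold are all assumptions made for illustration, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"


def build_log_odds_pssm(sites, background=None, pseudocount=1.0):
    """Build a log-odds PSSM (in bits) from aligned sites of equal length.

    Hypothetical helper: the uniform background and the background-weighted
    pseudocount are illustrative assumptions.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    pssm = []
    total = len(sites) + pseudocount
    for pos in range(width):
        # Start each count from a pseudocount split according to the background.
        counts = {b: pseudocount * background[b] for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        # Log-odds: observed frequency relative to the background frequency.
        pssm.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return pssm


def scan(sequence, pssm, threshold):
    """Slide the matrix along the forward strand and return (start, window, score)
    for every window scoring at or above the threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous symbols
        score = sum(column[base] for column, base in zip(pssm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Made-up example sites and sequence, for illustration only.
example_sites = ["TTGACA", "TTGACA", "TTGACT", "TTTACA"]
example_pssm = build_log_odds_pssm(example_sites)
example_hits = scan("GGGTTGACAGGG", example_pssm, threshold=5.0)
```

A genome-scale scan would also score the reverse complement of each window and attach a P-value to each hit, which is the role the abstract assigns to the Local Markov Model and Compound Importance Sampling algorithms; neither is reproduced here.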
FITBAR was developed to allow fast, accurate and statistically robust predictions of prokaryotic regulons. This statistical robustness constitutes the main advantage of the web tool over other matrix search programs and does not impair its performance. The web service is available at http://archaea.u-psud.fr/fitbar.