Turatsinze Jean-Valery, Thomas-Chollier Morgane, Defrance Matthieu, van Helden Jacques
Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles CP 263, Campus Plaine, Boulevard du Triomphe, Bruxelles, Belgium.
Nat Protoc. 2008;3(10):1578-88. doi: 10.1038/nprot.2008.97.
This protocol shows how to detect putative cis-regulatory elements, and regions enriched in such elements, with the Regulatory Sequence Analysis Tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors whose binding specificity is represented by position-specific scoring matrices, and it uses the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P values for the predicted sites and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a case study, the upstream sequence of the Drosophila melanogaster gene even-skipped. The protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
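To make the idea of scoring sites with a position-specific scoring matrix and attaching a P value to each score more concrete, the following is a minimal sketch in Python. It is not the matrix-scan implementation and does not call the RSAT server. The count matrix `COUNTS`, the uniform `BACKGROUND` model, the pseudocount, the `max_p` threshold and all function names are hypothetical choices made for illustration. The P value is obtained by exhaustive enumeration of all words of the motif length, which is only practical for very short motifs, and the input sequence is assumed to contain only the uppercase letters A, C, G and T.

```python
import itertools
import math

BASES = "ACGT"

# Hypothetical count matrix for a 4-bp motif (illustration only).
# Rows are motif positions; columns are counts of A, C, G, T.
COUNTS = [
    [8, 0, 1, 1],
    [0, 9, 0, 1],
    [1, 0, 9, 0],
    [0, 1, 0, 9],
]

# Assumed background model: equiprobable nucleotides.
BACKGROUND = {base: 0.25 for base in BASES}


def log_odds_matrix(counts, background, pseudo=1.0):
    """Convert counts to log-odds weights, with a pseudocount spread by background."""
    matrix = []
    for row in counts:
        total = sum(row) + pseudo
        matrix.append({
            base: math.log((count + pseudo * background[base]) / total / background[base])
            for base, count in zip(BASES, row)
        })
    return matrix


def score(word, matrix):
    """Sum the position-specific weights of a word of the same length as the matrix."""
    return sum(column[base] for column, base in zip(matrix, word))


def p_value(threshold, matrix, background):
    """Probability that a background word scores at least `threshold`.

    Computed by enumerating all 4**w words, so only usable for short motifs.
    """
    total = 0.0
    for word in itertools.product(BASES, repeat=len(matrix)):
        if score(word, matrix) >= threshold - 1e-12:
            total += math.prod(background[base] for base in word)
    return total


def scan(sequence, matrix, background, max_p=0.01):
    """Return (offset, word, score, P value) for every window with P value <= max_p."""
    width = len(matrix)
    hits = []
    for offset in range(len(sequence) - width + 1):
        word = sequence[offset:offset + width]
        word_score = score(word, matrix)
        word_p = p_value(word_score, matrix, background)
        if word_p <= max_p:
            hits.append((offset, word, word_score, word_p))
    return hits


if __name__ == "__main__":
    weights = log_odds_matrix(COUNTS, BACKGROUND)
    for offset, word, word_score, word_p in scan("TTACGTAA", weights, BACKGROUND):
        print(offset, word, round(word_score, 3), word_p)
```

With this toy matrix, scanning `TTACGTAA` keeps a single window, `ACGT` at offset 2. It is the unique best-scoring word, so its P value under the uniform background is 1/256. Thresholding on the P value rather than on the raw score is what makes hits comparable across matrices of different widths and information content. This sketch scans one strand only.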