Martyanov Viktor, Gross Robert H
Department of Biology, Dartmouth College.
J Vis Exp. 2011 May 31(51):2703. doi: 10.3791/2703.
SCOPE is an ensemble motif finder that runs three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif: BEAM finds non-degenerate motifs (ACCGGT), PRISM finds degenerate motifs (ASCGWT), and SPACER finds longer bipartite motifs (ACCnnnnnnnnGGT). By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we use a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies.

Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for additional genes that contain the motif and that, when added to the original set, improve the motif score. The improvement can come from over-representation or from motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the remaining genes regulated by the given transcription factor.

SCOPE has a friendly user interface that lets novice users access the algorithm's full power without having to become experts in the bioinformatics of motif finding. As input, SCOPE takes a list of genes or FASTA sequences, which can be entered in browser text fields or read from a file. FAQs and video tutorials are available at the SCOPE web site, which also includes a "Sample Search" button that allows the user to perform a trial run.

Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. The output lists all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify each motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of every motif occurrence (with exact position and strand indicated). Results are returned in a browser window and, optionally, by email. Previous papers describe the SCOPE algorithms in detail.
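The three motif classes named in the abstract can all be read as consensus strings over the IUPAC nucleotide alphabet, where S means C or G, W means A or T, and n means any base. The following is a minimal sketch of how such a consensus could be scanned against a sequence on both strands to produce the kind of instance list (position and strand) described above. It is not SCOPE's implementation and does no scoring; the function names `motif_to_regex`, `reverse_complement`, and `find_instances` are hypothetical and chosen only for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_to_regex(motif):
    """Translate an IUPAC consensus (hypothetical helper) into a regex string."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_instances(motif, seq):
    """Return sorted (start, strand) pairs; start is 0-based on the forward strand.

    A motif that equals its own reverse complement (for example ACCGGT)
    is reported once per strand at the same position.
    """
    seq = seq.upper()
    width = len(motif)
    # A lookahead lets the scan report overlapping occurrences as well.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to a forward-strand start.
        hits.append((len(seq) - m.start() - width, "-"))
    return sorted(hits)


if __name__ == "__main__":
    # The three example consensus strings from the abstract, on toy sequences.
    print(find_instances("ACCGGT", "TTACCGGTAA"))
    print(find_instances("ASCGWT", "GGACCGATCC"))
    print(find_instances("ACCnnnnnnnnGGT", "ACCTTTTTTTTGGT"))
```

A regular expression is used here only because it keeps the sketch short; the same consensus strings could equally be matched base by base.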