ARCS-Motif: discovering correlated motifs from unaligned biological sequences.

Authors

Zhang Shijie, Su Wei, Yang Jiong

Affiliation

Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA.

Publication information

Bioinformatics. 2009 Jan 15;25(2):183-9. doi: 10.1093/bioinformatics/btn609. Epub 2008 Dec 9.

DOI: 10.1093/bioinformatics/btn609
PMID: 19073591
Abstract

MOTIVATION

The goal of motif discovery is to detect novel, unknown, and important signals in biological sequences. In most models, the importance of a motif is equal to the sum of the similarity at every single position. In 2006, Song et al. introduced the Aggregated Related Column Score (ARCS) measure, which incorporates correlation information into the evaluation of motif importance. That paper showed that the ARCS measure is superior to other measures. Because the ARCS motif model is complicated, existing sequential motif discovery methods cannot be applied directly to find motifs with high ARCS values.
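
To make the limitation of position-wise scoring concrete, the sketch below sums a simple per-column similarity over all positions of a set of equal-length motif occurrences. It is only an illustration under stated assumptions: the similarity used here (the fraction of occurrences sharing the most common residue) and the function names `column_similarity` and `position_independent_score` are hypothetical choices, and this is not the ARCS measure or the authors' scoring function.

```python
from collections import Counter


def column_similarity(column):
    """Fraction of occurrences that share the most common residue in one column.

    Hypothetical stand-in for a per-position similarity; not the ARCS measure.
    """
    counts = Counter(column)
    return counts.most_common(1)[0][1] / len(column)


def position_independent_score(occurrences):
    """Sum the per-column similarity over every position of the motif."""
    if not occurrences:
        raise ValueError("need at least one occurrence")
    length = len(occurrences[0])
    if any(len(o) != length for o in occurrences):
        raise ValueError("all occurrences must have the same length")
    # zip(*occurrences) yields one tuple of residues per motif position.
    return sum(column_similarity(col) for col in zip(*occurrences))


# Two toy sets of occurrences with identical column compositions.
# In `correlated`, the residue in column 1 determines the residue in column 2;
# in `shuffled`, the two columns vary independently.
correlated = ["AK", "AK", "GR", "GR"]
shuffled = ["AK", "AR", "GK", "GR"]

print(position_independent_score(correlated))
print(position_independent_score(shuffled))
```

Both toy inputs receive the same score because each column is evaluated in isolation, so a purely position-wise sum cannot distinguish correlated columns from independent ones. This is the kind of information a correlation-aware measure is meant to capture.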

RESULTS

This article presents a novel mining algorithm, ARCS-Motif, to discover related sequential motifs in biological sequences. ARCS-Motif is applied to 400 PROSITE datasets and compared with five alternative methods (CONSENSUS, Gibbs sampler, MEME, SPLASH and DIALIGN-TX). ARCS-Motif outperforms all five methods in accuracy and most of them in efficiency. Although SPLASH is more efficient than ARCS-Motif, ARCS-Motif is much more accurate than SPLASH. On average, ARCS-Motif produces motifs that are at least 10% better than those of the best alternative method. Among the 400 PROSITE datasets, ARCS-Motif produces the best motifs for more than 200 families. Excluding SPLASH, the execution time of ARCS-Motif is less than a third of that of the fastest alternative method, and among all methods its execution time grows at the slowest rate with respect to the number of sequences and the average sequence length.

Similar articles

1
ARCS-Motif: discovering correlated motifs from unaligned biological sequences.
Bioinformatics. 2009 Jan 15;25(2):183-9. doi: 10.1093/bioinformatics/btn609. Epub 2008 Dec 9.
2
ARCS: an aggregated related column scoring scheme for aligned sequences.
Bioinformatics. 2006 Oct 1;22(19):2326-32. doi: 10.1093/bioinformatics/btl398. Epub 2006 Jul 26.
3
Relation between weight matrix and substitution matrix: motif search by similarity.
Bioinformatics. 2005 Apr 1;21(7):938-43. doi: 10.1093/bioinformatics/bti090. Epub 2004 Oct 28.
4
Discovering sequence motifs.
Methods Mol Biol. 2008;452:231-51. doi: 10.1007/978-1-60327-159-2_12.
5
Motif-based protein ranking by network propagation.
Bioinformatics. 2005 Oct 1;21(19):3711-8. doi: 10.1093/bioinformatics/bti608. Epub 2005 Aug 2.
6
Fast model-based protein homology detection without alignment.
Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.
7
CompariMotif: quick and easy comparisons of sequence motifs.
Bioinformatics. 2008 May 15;24(10):1307-9. doi: 10.1093/bioinformatics/btn105. Epub 2008 Mar 28.
8
Localized motif discovery in gene regulatory sequences.
Bioinformatics. 2010 May 1;26(9):1152-9. doi: 10.1093/bioinformatics/btq106. Epub 2010 Mar 11.
9
Discovering novel sequence motifs with MEME.
Curr Protoc Bioinformatics. 2002 Nov;Chapter 2:Unit 2.4. doi: 10.1002/0471250953.bi0204s00.
10
Computing the P-value of the information content from an alignment of multiple sequences.
Bioinformatics. 2005 Jun;21 Suppl 1:i311-8. doi: 10.1093/bioinformatics/bti1044.

Cited by

1
PePPER: a webserver for prediction of prokaryote promoter elements and regulons.
BMC Genomics. 2012 Jul 2;13:299. doi: 10.1186/1471-2164-13-299.