Finding optimal degenerate patterns in DNA sequences.

Authors

Shinozaki Daisuke, Akutsu Tatsuya, Maruyama Osamu

Affiliation

Graduate School of Mathematics, Kyushu University, Fukuoka, Japan.

Publication

Bioinformatics. 2003 Oct;19 Suppl 2:ii206-14. doi: 10.1093/bioinformatics/btg1079.

DOI: 10.1093/bioinformatics/btg1079
PMID: 14534191

Abstract

MOTIVATION

Finding transcription factor binding sites in the upstream regions of given genes is an algorithmically interesting and challenging problem in computational biology. A degenerate pattern over a finite alphabet Sigma is a sequence of subsets of Sigma. A string over IUPAC nucleic acid codes is also a degenerate pattern over Sigma = {A, C, G, T}, and is used as one of the major patterns modeling transcription factor binding sites in the upstream regions of genes. However, it is known that the problem of finding a degenerate pattern consistent with both positive and negative string sets is in general NP-complete. Our aim is to devise a heuristic algorithm to find a degenerate pattern which is optimal for positive and negative string sets with respect to a given score function.
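
To make the definition concrete, the following minimal Python sketch expands an IUPAC string into a sequence of subsets of {A, C, G, T} and tests it against positive and negative string sets. The function names are hypothetical and do not come from the SUPERPOSITION script. The notion of "consistent" used here (the pattern occurs in every positive string and in no negative string) is an assumption, since the abstract does not spell out the definition. Sequences are assumed to be uppercase.

```python
# Standard IUPAC nucleotide codes mapped to the subsets of {A, C, G, T} they denote.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def to_subsets(pattern):
    """Turn an IUPAC string into a degenerate pattern: a list of subsets of Sigma."""
    return [frozenset(IUPAC[code]) for code in pattern.upper()]


def occurs_in(pattern, sequence):
    """Return True if the degenerate pattern matches some substring of sequence."""
    subsets = to_subsets(pattern)
    k = len(subsets)
    return any(
        all(sequence[i + j] in subsets[j] for j in range(k))
        for i in range(len(sequence) - k + 1)
    )


def is_consistent(pattern, positives, negatives):
    """Assumed definition: occurs in every positive string and in no negative string."""
    return (all(occurs_in(pattern, s) for s in positives)
            and not any(occurs_in(pattern, s) for s in negatives))
```

For example, the pattern `TGAY` stands for the subset sequence {T}, {G}, {A}, {C, T}, so it matches both `TGAC` and `TGAT` but not `TGAG`. This brute-force check only verifies a single candidate pattern; searching over all candidates is the hard part that the abstract refers to.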

RESULTS

We have proposed an enumerative algorithm called SUPERPOSITION for finding optimal degenerate patterns with a pruning technique, which works with almost all reasonable score functions. The performance score of the algorithm has been compared with those of the popular motif-finding algorithms YMF, MEME and AlignACE on various sets of co-regulated yeast genes. In the computational experiment, SUPERPOSITION outperformed the others on several gene sets.

AVAILABILITY

The Python script SUPERPOSITION is available at http://www.math.kyushu-u.ac.jp/~om/softwares.html

Similar articles

1. Finding optimal degenerate patterns in DNA sequences.
   Bioinformatics. 2003 Oct;19 Suppl 2:ii206-14. doi: 10.1093/bioinformatics/btg1079.
2. On counting position weight matrix matches in a sequence, with application to discriminative motif finding.
   Bioinformatics. 2006 Jul 15;22(14):e454-63. doi: 10.1093/bioinformatics/btl227.
3. Searching for statistically significant regulatory modules.
   Bioinformatics. 2003 Oct;19 Suppl 2:ii16-25. doi: 10.1093/bioinformatics/btg1054.
4. Computational detection of cis-regulatory modules.
   Bioinformatics. 2003 Oct;19 Suppl 2:ii5-14. doi: 10.1093/bioinformatics/btg1052.
5. SPACER: identification of cis-regulatory elements with non-contiguous critical residues.
   Bioinformatics. 2007 Apr 15;23(8):1029-31. doi: 10.1093/bioinformatics/btm041.
6. Predicting transcription factor binding sites using local over-representation and comparative genomics.
   BMC Bioinformatics. 2006 Aug 31;7:396. doi: 10.1186/1471-2105-7-396.
7. A graph-based approach to systematically reconstruct human transcriptional regulatory modules.
   Bioinformatics. 2007 Jul 1;23(13):i577-86. doi: 10.1093/bioinformatics/btm227.
8. MotifCut: regulatory motifs finding with maximum density subgraphs.
   Bioinformatics. 2006 Jul 15;22(14):e150-7. doi: 10.1093/bioinformatics/btl243.
9. IEM: an algorithm for iterative enhancement of motifs using comparative genomics data.
   Comput Syst Bioinformatics Conf. 2007;6:227-35.
10. Computational discovery of transcriptional regulatory rules.
   Bioinformatics. 2005 Sep 1;21 Suppl 2:ii101-7. doi: 10.1093/bioinformatics/bti1117.

Cited by

1. Motif discovery and transcription factor binding sites before and after the next-generation sequencing era.
   Brief Bioinform. 2013 Mar;14(2):225-37. doi: 10.1093/bib/bbs016. Epub 2012 Apr 19.
2. A novel ensemble learning method for de novo computational identification of DNA binding sites.
   BMC Bioinformatics. 2007 Jul 12;8:249. doi: 10.1186/1471-2105-8-249.
3. Bounded search for de novo identification of degenerate cis-regulatory elements.
   BMC Bioinformatics. 2006 May 15;7:254. doi: 10.1186/1471-2105-7-254.