Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm.

Authors

Rigoutsos I, Floratos A

Affiliation

Computational Biology Center, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA.

Publication

Bioinformatics. 1998;14(1):55-67. doi: 10.1093/bioinformatics/14.1.55.

DOI: 10.1093/bioinformatics/14.1.55
PMID: 9520502

Abstract

MOTIVATION

The discovery of motifs in biological sequences is an important problem.

RESULTS

This paper presents a new algorithm for the discovery of rigid patterns (motifs) in biological sequences. Our method is combinatorial in nature and able to produce all patterns that appear in at least a (user-defined) minimum number of sequences, yet it manages to be very efficient by avoiding the enumeration of the entire pattern space. Furthermore, the reported patterns are maximal: any reported pattern cannot be made more specific and still keep on appearing at the exact same positions within the input sequences. The effectiveness of the proposed approach is showcased on a number of test cases which aim to: (i) validate the approach through the discovery of previously reported patterns; (ii) demonstrate the capability to identify automatically highly selective patterns particular to the sequences under consideration. Finally, experimental analysis indicates that the algorithm is output sensitive, i.e. its running time is quasi-linear to the size of the generated output.
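
The definitions in the abstract (rigid pattern, minimum number of sequences, maximality) can be illustrated with a small sketch. This is not the TEIRESIAS algorithm: it is a brute-force check of the definitions only. The `.` wildcard notation, the function names, and the toy sequences in `SEQS` are assumptions made for illustration, and the density parameters used by the actual method are ignored.

```python
from typing import List, Tuple

WILDCARD = "."  # assumed notation for a "don't care" position in a rigid pattern

# Toy input, invented for illustration only.
SEQS = ["XXAQCYY", "ZAKCW", "AGCT"]


def occurrences(pattern: str, sequences: List[str]) -> List[Tuple[int, int]]:
    """Return (sequence index, offset) pairs where the rigid pattern matches."""
    hits = []
    for i, seq in enumerate(sequences):
        for off in range(len(seq) - len(pattern) + 1):
            if all(p == WILDCARD or p == seq[off + j] for j, p in enumerate(pattern)):
                hits.append((i, off))
    return hits


def support(pattern: str, sequences: List[str]) -> int:
    """Number of distinct sequences that contain the pattern at least once."""
    return len({i for i, _ in occurrences(pattern, sequences)})


def is_maximal(pattern: str, sequences: List[str]) -> bool:
    """True if no position can be made more specific at the same occurrences.

    A position is a wildcard inside the pattern or any offset to its left or
    right. If every occurrence carries the same residue there, the pattern
    could be specialized without losing an occurrence, so it is not maximal.
    """
    hits = occurrences(pattern, sequences)
    if not hits:
        return False
    longest = max(len(s) for s in sequences)
    for rel in range(-longest, longest + len(pattern)):
        if 0 <= rel < len(pattern) and pattern[rel] != WILDCARD:
            continue  # already a fixed residue
        residues = set()
        for i, off in hits:
            pos = off + rel
            # None marks a position that falls outside the sequence.
            residues.add(sequences[i][pos] if 0 <= pos < len(sequences[i]) else None)
        if len(residues) == 1 and None not in residues:
            return False
    return True
```

On the toy input, `A.C` occurs once in each of the three sequences and is maximal, while `A.` occurs at the same three places but is not maximal, because every occurrence is followed by `C`. Enumerating and testing candidates this way scales with the size of the pattern space, which is exactly the cost the abstract says the algorithm avoids.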

Similar articles

1
Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm.
Bioinformatics. 1998;14(1):55-67. doi: 10.1093/bioinformatics/14.1.55.
2
An efficient, versatile and scalable pattern growth approach to mine frequent patterns in unaligned protein sequences.
Bioinformatics. 2007 Mar 15;23(6):687-93. doi: 10.1093/bioinformatics/btl665. Epub 2007 Jan 19.
3
Efficient constrained multiple sequence alignment with performance guarantee.
Proc IEEE Comput Soc Bioinform Conf. 2003;2:337-46.
4
MUSA: a parameter free algorithm for the identification of biologically significant motifs.
Bioinformatics. 2006 Dec 15;22(24):2996-3002. doi: 10.1093/bioinformatics/btl537. Epub 2006 Oct 26.
5
A generic motif discovery algorithm for sequential data.
Bioinformatics. 2006 Jan 1;22(1):21-8. doi: 10.1093/bioinformatics/bti745. Epub 2005 Oct 27.
6
A fast Boyer-Moore type pattern matching algorithm for highly similar sequences.
Int J Data Min Bioinform. 2015;13(3):266-88. doi: 10.1504/ijdmb.2015.072101.
7
AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.
Proc IEEE Comput Soc Bioinform Conf. 2003;2:326-36.
8
Bases of motifs for generating repeated patterns with wild cards.
IEEE/ACM Trans Comput Biol Bioinform. 2005 Jan-Mar;2(1):40-50. doi: 10.1109/TCBB.2005.5.
9
Finding flexible patterns in unaligned protein sequences.
Protein Sci. 1995 Aug;4(8):1587-95. doi: 10.1002/pro.5560040817.
10
Mining Contiguous Sequential Generators in Biological Sequences.
IEEE/ACM Trans Comput Biol Bioinform. 2016 Sep-Oct;13(5):855-867. doi: 10.1109/TCBB.2015.2495132. Epub 2015 Oct 26.

Cited by

1
SHARK-capture identifies functional motifs in intrinsically disordered protein regions.
Protein Sci. 2025 Apr;34(4):e70091. doi: 10.1002/pro.70091.
2
The determinants of the rarity of nucleic and peptide short sequences in nature.
NAR Genom Bioinform. 2024 Apr 4;6(2):lqae029. doi: 10.1093/nargab/lqae029. eCollection 2024 Jun.
3
Databases and computational methods for the identification of piRNA-related molecules: A survey.
Comput Struct Biotechnol J. 2024 Jan 22;23:813-833. doi: 10.1016/j.csbj.2024.01.011. eCollection 2024 Dec.
4
Context-dependent T-cell Receptor Gene Repertoire Profiles in Proliferations of T Large Granular Lymphocytes.
Hemasphere. 2023 Jul 17;7(8):e929. doi: 10.1097/HS9.0000000000000929. eCollection 2023 Aug.
5
A Review on Planted (, d) Motif Discovery Algorithms for Medical Diagnose.
Sensors (Basel). 2022 Feb 5;22(3):1204. doi: 10.3390/s22031204.
6
MicroSalmon: A Comprehensive, Searchable Resource of Predicted MicroRNA Targets and 3'UTR Cis-Regulatory Elements in the Full-Length Sequenced Atlantic Salmon Transcriptome.
Noncoding RNA. 2021 Sep 22;7(4):61. doi: 10.3390/ncrna7040061.
7
Bioinformatics and Machine Learning Approaches to Understand the Regulation of Mobile Genetic Elements.
Biology (Basel). 2021 Sep 10;10(9):896. doi: 10.3390/biology10090896.
8
PolarProtPred: predicting apical and basolateral localization of transmembrane proteins using putative short linear motifs and deep learning.
Bioinformatics. 2021 Dec 7;37(23):4328-4335. doi: 10.1093/bioinformatics/btab480.
9
TRB sequences targeting ORF1a/b are associated with disease severity in hospitalized COVID-19 patients.
J Leukoc Biol. 2022 Jan;111(1):283-289. doi: 10.1002/JLB.6COVCRA1120-762R. Epub 2021 Apr 13.
10
Pyknon-Containing Transcripts Are Downregulated in Colorectal Cancer Tumors, and Loss of Is Associated With Worse Patient Outcome.
Front Genet. 2020 Nov 12;11:581454. doi: 10.3389/fgene.2020.581454. eCollection 2020.