Kangaroo--a pattern-matching program for biological sequences.

Authors

Betel Doron, Hogue Christopher W V

Affiliation

Department of Biochemistry, University of Toronto, Toronto, Ontario, M5S 1A8, Canada.

Publication

BMC Bioinformatics. 2002 Jul 31;3:20. doi: 10.1186/1471-2105-3-20.

DOI: 10.1186/1471-2105-3-20
PMID: 12150718
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC119856/
Abstract

BACKGROUND

Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells.

RESULTS

Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/.
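The kind of query described above can be illustrated with a minimal Python sketch. This is not Kangaroo's implementation: the function name `search_sequences`, the toy protein records, and the choice of the N-glycosylation motif (PROSITE-style N-{P}-[ST]-{P}) are assumptions made purely for illustration.

```python
import re
from typing import Dict, Iterator, Tuple


def search_sequences(
    pattern: str, sequences: Dict[str, str]
) -> Iterator[Tuple[str, int, str]]:
    """Yield (sequence id, 1-based start position, matched text) per match.

    Hypothetical helper for illustration; not part of Kangaroo.
    """
    regex = re.compile(pattern)
    for seq_id, seq in sequences.items():
        # finditer reports non-overlapping matches, scanning left to right.
        for match in regex.finditer(seq):
            yield seq_id, match.start() + 1, match.group()


# Toy protein records, made up for this example.
proteins = {
    "toy1": "MKNGSAPLNPTQ",
    "toy2": "MAAAKLV",
}

# N-glycosylation motif N-{P}-[ST]-{P} written as a regular expression.
hits = list(search_sequences(r"N[^P][ST][^P]", proteins))
print(hits)
```

Because the pattern is passed through unchanged to the regular expression engine, the same helper could in principle be pointed at DNA or coding-region records; only the alphabet in the pattern changes.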

CONCLUSION

A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats.
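A search for mononucleotide repeats in a coding sequence, as mentioned above, might look like the following sketch. The minimum run length of 8, the function name `find_mononucleotide_repeats`, and the toy sequence are assumptions for illustration; the abstract does not state the thresholds or expressions the authors used.

```python
import re
from typing import List, Tuple


def find_mononucleotide_repeats(
    seq: str, min_len: int = 8
) -> List[Tuple[int, str, int]]:
    """Return (1-based start, base, run length) for each run of one base.

    min_len is an assumed threshold, not a value taken from the paper.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start() + 1, m.group(1), len(m.group()))
        for m in pattern.finditer(seq.upper())
    ]


# Toy coding sequence built from parts so the run lengths are explicit.
cds = "ATGGC" + "A" * 10 + "GT" + "C" * 8 + "TGA"
repeats = find_mononucleotide_repeats(cds)
print(repeats)
```

The backreference `\1` keeps the expression short; an equivalent alternation such as `A{8,}|C{8,}|G{8,}|T{8,}` would report the same runs.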

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f43e/119856/66543e68c809/1471-2105-3-20-1.jpg
