Kangaroo--a pattern-matching program for biological sequences.

Author information

Betel Doron, Hogue Christopher W V

Affiliation

Department of Biochemistry, University of Toronto, Toronto, Ontario, M5S 1A8, Canada.

Publication information

BMC Bioinformatics. 2002 Jul 31;3:20. doi: 10.1186/1471-2105-3-20.

Abstract

BACKGROUND

Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools for such simple searches, for example identifying transcription factor binding sites, protein motifs, or repetitive DNA sequences. Yet in many cases a simple pattern-matching search can reveal a wealth of information. In this paper we present a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions, with the aim of identifying potential mutation sites in mismatch repair-deficient cells.

RESULTS

Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/.
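As a rough illustration of the kind of query described above, the following Python sketch scans a small set of sequences with a regular expression and reports where it matches. It is not Kangaroo's implementation: the function name `find_pattern`, the toy protein sequences, and the choice of the N-glycosylation consensus as the example motif are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple


def find_pattern(pattern: str, sequences: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (sequence id, 1-based start, matched text) for each non-overlapping match."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for seq_id, seq in sequences.items():
        for match in regex.finditer(seq):
            # match.start() is 0-based; sequence coordinates are usually 1-based.
            hits.append((seq_id, match.start() + 1, match.group()))
    return hits


# Toy protein sequences (hypothetical, not taken from any database).
proteins = {
    "prot1": "MKNGTLLAANPSA",
    "prot2": "MAAALLKK",
}

# N-{P}-[ST]-{P}: the N-glycosylation consensus written as a regular expression.
hits = find_pattern(r"N[^P][ST][^P]", proteins)
print(hits)
```

Because the pattern is an ordinary regular expression, the same function works unchanged for DNA or coding-region sequences; only the alphabet of the pattern changes.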

CONCLUSION

A simple, low-level pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats.
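The mononucleotide-repeat search mentioned above can be expressed as a single regular expression with a backreference. The sketch below is only an assumption about how such a query might look: the function name `mononucleotide_repeats`, the default minimum run length of 8, and the toy sequence are illustrative and are not taken from the paper.

```python
import re
from typing import List, Tuple


def mononucleotide_repeats(dna: str, min_len: int = 8) -> List[Tuple[int, str, int]]:
    """Return (1-based start, base, run length) for runs of one base of at least min_len."""
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # ([ACGT]) captures one base; \1{n,} requires at least n further copies of it.
    regex = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start() + 1, m.group(1), len(m.group()))
        for m in regex.finditer(dna.upper())
    ]


# Toy coding sequence (hypothetical): a run of 8 A's and a run of 5 C's.
cds = "ATG" + "A" * 8 + "GGCT" + "C" * 5 + "TGA"

long_runs = mononucleotide_repeats(cds)             # default threshold of 8
all_runs = mononucleotide_repeats(cds, min_len=5)   # lower, arbitrary threshold
print(long_runs, all_runs)
```

The threshold is a parameter because the run length considered biologically relevant depends on the study; the value used here is arbitrary.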


