Search algorithm for pattern match analysis of nucleic acid sequences.

Author information

Harr R, Häggström M, Gustafsson P

Publication information

Nucleic Acids Res. 1983 May 11;11(9):2943-57. doi: 10.1093/nar/11.9.2943.

DOI:10.1093/nar/11.9.2943
PMID:6344023
Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC325935/
Abstract

A new type of search algorithm to find biological information inherited in nucleic acid sequences was developed. The algorithm is of pattern match type and is based on the fact that genetic information often is a function of a predictable statistical occurrence of the four bases within parts of the sequence. The search algorithm compares the known statistical pattern of bases in e.g. a promoter, with an unknown sequence and calculates the statistical significance of the match at all positions in the unknown sequence. The program was tested on 54 published prokaryotic promoters. 44 or 49 could be found with 1 or 4 false answers, respectively. The program was also used on plasmid pBR322. All promoters functioning in an in vitro transcription system were found (tet, anti-tet, p4, bla and ori) except the so called p5 promoter. A search for donor and acceptor sites was performed in a human HLA genomic sequence that contains six introns. Five of the possible six donor and acceptor sites were found.
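The abstract describes comparing a known per-position base pattern with every position of an unknown sequence. A minimal sketch of that general idea follows, using a log-odds position weight matrix. This is an illustration under stated assumptions, not the authors' program: the example sites, the pseudocount, the uniform background of 0.25, and the function names `build_pwm`, `scan`, and `best_hit` are all hypothetical, and the score is a plain log-odds sum rather than the statistical significance the paper computes.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds score table from equal-length example sites.

    Assumption: uniform background frequency and a simple additive
    pseudocount; the original paper may weight positions differently.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all example sites must have the same length")
    pwm = []
    for i in range(length):
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def scan(sequence, pwm):
    """Score every window of the sequence; return (position, score) pairs."""
    width = len(pwm)
    return [
        (pos, sum(pwm[i][sequence[pos + i]] for i in range(width)))
        for pos in range(len(sequence) - width + 1)
    ]

def best_hit(sequence, pwm):
    """Return the (position, score) pair with the highest score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])

# Hypothetical -10 region variants, used only to exercise the code.
example_sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
pwm = build_pwm(example_sites)
position, score = best_hit("GGCGTATAATGCCG", pwm)
print(position, round(score, 2))
```

In this toy input the highest-scoring window starts at index 4, where the sequence matches the most frequent base at every position. A real search would also need a score threshold, which is where the trade-off between promoters found and false answers reported in the abstract would arise.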

Similar articles

1
Search algorithm for pattern match analysis of nucleic acid sequences.
Nucleic Acids Res. 1983 May 11;11(9):2943-57. doi: 10.1093/nar/11.9.2943.
2
A novel method for promoter search enhanced by function-specific subgrouping of promoters--developed and tested on E.coli system.
Nucleic Acids Res. 1989 Jun 26;17(12):4799-815. doi: 10.1093/nar/17.12.4799.
3
Escherichia coli promoters. II. A spacing class-dependent promoter search protocol.
J Biol Chem. 1989 Apr 5;264(10):5531-4.
4
Analysis of the occurrence of promoter-sites in DNA.
Nucleic Acids Res. 1986 Jan 10;14(1):109-26. doi: 10.1093/nar/14.1.109.
5
Nucleotide sequence of an Escherichia coli tRNA (Leu 1) operon and identification of the transcription promoter signal.
Nucleic Acids Res. 1981 May 11;9(9):2121-39. doi: 10.1093/nar/9.9.2121.
6
Analysis of E.coli promoter structures using neural networks.
Nucleic Acids Res. 1994 Jun 11;22(11):2158-65. doi: 10.1093/nar/22.11.2158.
7
Transcription regulation in vitro by an E. coli promoter containing a DNA cruciform in the '-35' region.
Nucleic Acids Res. 1989 Jul 25;17(14):5537-45. doi: 10.1093/nar/17.14.5537.
8
Analysis of the Escherichia coli genome VI: DNA sequence of the region from 92.8 through 100 minutes.
Nucleic Acids Res. 1995 Jun 25;23(12):2105-19. doi: 10.1093/nar/23.12.2105.
9
A new method for finding long consensus patterns in nucleic acid sequences.
Comput Appl Biosci. 1991 Oct;7(4):495-500. doi: 10.1093/bioinformatics/7.4.495.
10
Transcription initiation at the tet promoter and effect of mutations.
Nucleic Acids Res. 1988 Aug 11;16(15):7269-85. doi: 10.1093/nar/16.15.7269.

Cited by

1
Explainability in transformer models for functional genomics.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab060.
2
Inherent limitations of probabilistic models for protein-DNA binding specificity.
PLoS Comput Biol. 2017 Jul 7;13(7):e1005638. doi: 10.1371/journal.pcbi.1005638. eCollection 2017 Jul.
3
Modeling the specificity of protein-DNA interactions.
Quant Biol. 2013 Jun;1(2):115-130. doi: 10.1007/s40484-013-0012-4.
4
The next generation of transcription factor binding site prediction.
PLoS Comput Biol. 2013;9(9):e1003214. doi: 10.1371/journal.pcbi.1003214. Epub 2013 Sep 5.
5
Identification of cis-regulatory modules in promoters of human genes exploiting mutual positioning of transcription factors.
Nucleic Acids Res. 2013 Oct;41(19):8822-41. doi: 10.1093/nar/gkt578. Epub 2013 Aug 2.
6
Optimizing the GATA-3 position weight matrix to improve the identification of novel binding sites.
BMC Genomics. 2012 Aug 22;13:416. doi: 10.1186/1471-2164-13-416.
7
Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites.
Nucleic Acids Res. 2005 Apr 22;33(7):2290-301. doi: 10.1093/nar/gki519. Print 2005.
8
Identification and utilization of arbitrary correlations in models of recombination signal sequences.
Genome Biol. 2002;3(12):RESEARCH0072. doi: 10.1186/gb-2002-3-12-research0072. Epub 2002 Nov 21.
9
Gene recognition via spliced sequence alignment.
Proc Natl Acad Sci U S A. 1996 Aug 20;93(17):9061-6. doi: 10.1073/pnas.93.17.9061.
10
The predicted amino acid sequence of the Salmonella typhimurium virulence gene mviAA(+) strongly indicates that MviA is a regulator protein of a previously unknown S. typhimurium response regulator family.
Infect Immun. 1996 Jun;64(6):2365-7. doi: 10.1128/iai.64.6.2365-2367.1996.

References cited in this article

1
Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli.
Nucleic Acids Res. 1982 May 11;10(9):2997-3011. doi: 10.1093/nar/10.9.2997.
2
Organization of transcriptional signals in plasmids pBR322 and pACYC184.
Proc Natl Acad Sci U S A. 1981 Jan;78(1):167-71. doi: 10.1073/pnas.78.1.167.
3
Nucleotide sequence of the proximal portion of the RNA polymerase beta subunit gene of Escherichia coli.
Gene. 1980 Nov;11(3-4):367-73. doi: 10.1016/0378-1119(80)90076-1.
4
Spacer mutations in the lac ps promoter.
Proc Natl Acad Sci U S A. 1982 Feb;79(4):1069-72. doi: 10.1073/pnas.79.4.1069.
5
Enhanced graphic matrix analysis of nucleic acid and protein sequences.
Proc Natl Acad Sci U S A. 1981 Dec;78(12):7665-9. doi: 10.1073/pnas.78.12.7665.
6
Organization and expression of eucaryotic split genes coding for proteins.
Annu Rev Biochem. 1981;50:349-83. doi: 10.1146/annurev.bi.50.070181.002025.
7
Exon/intron organization and complete nucleotide sequence of an HLA gene.
Proc Natl Acad Sci U S A. 1982 Feb;79(3):893-7. doi: 10.1073/pnas.79.3.893.
8
A lac promoter with a changed distance between -10 and -35 regions.
Nucleic Acids Res. 1982 Feb 11;10(3):903-12. doi: 10.1093/nar/10.3.903.
9
Internal promoters of the rpoBC operon of Escherichia coli.
Mol Gen Genet. 1981;184(3):548-50. doi: 10.1007/BF00352538.
10
The primary structure of Escherichia coli RNA polymerase. Nucleotide sequence of the rpoB gene and amino-acid sequence of the beta-subunit.
Eur J Biochem. 1981 Jun 1;116(3):621-9. doi: 10.1111/j.1432-1033.1981.tb05381.x.