• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

一种序列比较的新方法:标准化序列比对。

A new approach to sequence comparison: normalized sequence alignment.

作者信息

Arslan A N, Eğecioğlu O, Pevzner P A

机构信息

Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, USA Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA 92093, USA.

出版信息

Bioinformatics. 2001 Apr;17(4):327-37. doi: 10.1093/bioinformatics/17.4.327.

DOI:10.1093/bioinformatics/17.4.327
PMID:11301301
Abstract

The Smith-Waterman algorithm for local sequence alignment is one of the most important techniques in computational molecular biology. This ingenious dynamic programming approach was designed to reveal the highly conserved fragments by discarding poorly conserved initial and terminal segments. However, the existing notion of local similarity has a serious flaw: it does not discard poorly conserved intermediate segments. The Smith-Waterman algorithm finds the local alignment with maximal score but it is unable to find local alignment with maximum degree of similarity (e.g. maximal percent of matches). Moreover, there is still no efficient algorithm that answers the following natural question: do two sequences share a (sufficiently long) fragment with more than 70% of similarity? As a result, the local alignment sometimes produces a mosaic of well-conserved fragments artificially connected by poorly-conserved or even unrelated fragments. This may lead to problems in comparison of long genomic sequences and comparative gene prediction as recently pointed out by Zhang et al. (Bioinformatics, 15, 1012-1019, 1999). In this paper we propose a new sequence comparison algorithm (normalized local alignment ) that reports the regions with maximum degree of similarity. The algorithm is based on fractional programming and its running time is O(n2log n). In practice, normalized local alignment is only 3-5 times slower than the standard Smith-Waterman algorithm.

摘要

用于局部序列比对的史密斯-沃特曼算法是计算分子生物学中最重要的技术之一。这种巧妙的动态规划方法旨在通过舍弃保守性较差的起始和末端片段来揭示高度保守的片段。然而,现有的局部相似性概念存在一个严重缺陷:它不会舍弃保守性较差的中间片段。史密斯-沃特曼算法能找到得分最高的局部比对,但无法找到相似度最高的局部比对(例如匹配百分比最高)。此外,仍然没有一种有效的算法能回答以下自然问题:两个序列是否共享一个相似度超过70%的(足够长的)片段?因此,局部比对有时会产生由保守性较差甚至不相关的片段人为连接起来的高度保守片段的拼接图。正如张等人最近指出的(《生物信息学》,第15卷,第1012 - 1019页,1999年),这可能会在长基因组序列比较和比较基因预测中导致问题。在本文中,我们提出了一种新的序列比较算法(归一化局部比对),该算法能报告相似度最高的区域。该算法基于分式规划,其运行时间为O(n2log n)。在实际应用中,归一化局部比对仅比标准的史密斯-沃特曼算法慢3至5倍。

相似文献

1
A new approach to sequence comparison: normalized sequence alignment.一种序列比较的新方法:标准化序列比对。
Bioinformatics. 2001 Apr;17(4):327-37. doi: 10.1093/bioinformatics/17.4.327.
2
Pairwise alignment for very long nucleic acid sequences.非常长的核酸序列的两两比对。
Biochem Biophys Res Commun. 2018 Jul 20;502(3):313-317. doi: 10.1016/j.bbrc.2018.05.134. Epub 2018 May 29.
3
Pairwise alignment of nucleotide sequences using maximal exact matches.使用最大完全匹配进行核苷酸序列的两两比对。
BMC Bioinformatics. 2019 May 21;20(1):261. doi: 10.1186/s12859-019-2827-0.
4
Optimizing amino acid substitution matrices with a local alignment kernel.使用局部比对核优化氨基酸替换矩阵。
BMC Bioinformatics. 2006 May 5;7:246. doi: 10.1186/1471-2105-7-246.
5
ADEPT: a domain independent sequence alignment strategy for gpu architectures.ADEPT:一种适用于 GPU 架构的与领域无关的序列比对策略。
BMC Bioinformatics. 2020 Sep 15;21(1):406. doi: 10.1186/s12859-020-03720-1.
6
Heuristic reusable dynamic programming: efficient updates of local sequence alignment.启发式可重用动态规划:局部序列比对的高效更新。
IEEE/ACM Trans Comput Biol Bioinform. 2009 Oct-Dec;6(4):570-82. doi: 10.1109/TCBB.2009.30.
7
Adaptive Smith-Waterman residue match seeding for protein structural alignment.自适应 Smith-Waterman 残基匹配种子法用于蛋白质结构比对。
Proteins. 2013 Oct;81(10):1823-39. doi: 10.1002/prot.24327. Epub 2013 Aug 19.
8
Pairwise sequence alignment for very long sequences on GPUs.在图形处理器(GPU)上对超长序列进行成对序列比对。
Int J Bioinform Res Appl. 2014;10(4-5):345-68. doi: 10.1504/IJBRA.2014.062989.
9
CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment.CUDA兼容的GPU卡作为用于Smith-Waterman序列比对的高效硬件加速器。
BMC Bioinformatics. 2008 Mar 26;9 Suppl 2(Suppl 2):S10. doi: 10.1186/1471-2105-9-S2-S10.
10
Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms.搜索蛋白质序列文库:Smith-Waterman算法与FASTA算法的灵敏度和选择性比较
Genomics. 1991 Nov;11(3):635-50. doi: 10.1016/0888-7543(91)90071-l.

引用本文的文献

1
Detection of long non-coding RNA homology, a comparative study on alignment and alignment-free metrics.检测长非编码 RNA 同源性:对齐和无对齐度量的比较研究。
BMC Bioinformatics. 2018 Nov 6;19(1):407. doi: 10.1186/s12859-018-2441-6.
2
Analysis of 5' gene regions reveals extraordinary conservation of novel non-coding sequences in a wide range of animals.对5'基因区域的分析揭示了多种动物中新型非编码序列的异常保守性。
BMC Evol Biol. 2015 Oct 19;15:227. doi: 10.1186/s12862-015-0499-6.
3
Adjusting scoring matrices to correct overextended alignments.
调整评分矩阵以纠正过度延伸的比对。
Bioinformatics. 2013 Dec 1;29(23):3007-13. doi: 10.1093/bioinformatics/btt517. Epub 2013 Aug 31.
4
Accurate statistics for local sequence alignment with position-dependent scoring by rare-event sampling.基于稀有事件抽样的位置相关评分的局部序列比对的精确统计。
BMC Bioinformatics. 2011 Feb 3;12:47. doi: 10.1186/1471-2105-12-47.
5
iPARTS: an improved tool of pairwise alignment of RNA tertiary structures.iPARTS:一种改进的 RNA 三级结构比对工具。
Nucleic Acids Res. 2010 Jul;38(Web Server issue):W340-7. doi: 10.1093/nar/gkq483. Epub 2010 May 27.
6
A new measurement of sequence conservation.序列保守性的新度量。
BMC Genomics. 2009 Dec 22;10:623. doi: 10.1186/1471-2164-10-623.
7
SARSA: a web tool for structural alignment of RNA using a structural alphabet.SARSA:一种使用结构字母表进行RNA结构比对的网络工具。
Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W19-24. doi: 10.1093/nar/gkn327. Epub 2008 May 23.
8
A pattern matching approach for the estimation of alignment between any two given DNA sequences.一种用于估计任意两个给定DNA序列之间比对的模式匹配方法。
J Med Syst. 2007 Aug;31(4):247-53. doi: 10.1007/s10916-007-9062-3.
9
Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs.使用miraEST组装程序在已测序的EST中进行可靠且自动化的mRNA转录本组装和SNP检测。
Genome Res. 2004 Jun;14(6):1147-59. doi: 10.1101/gr.1917404. Epub 2004 May 12.
10
Minimal entropy probability paths between genome families.基因组家族之间的最小熵概率路径。
J Math Biol. 2004 May;48(5):563-90. doi: 10.1007/s00285-003-0248-0. Epub 2003 Dec 2.