• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

从(斯坦纳)树中选取比对。

Picking alignments from (Steiner) trees.

作者信息

Lam Fumei, Alexandersson Marina, Pachter Lior

机构信息

Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

出版信息

J Comput Biol. 2003;10(3-4):509-20. doi: 10.1089/10665270360688156.

DOI:10.1089/10665270360688156
PMID:12935341
Abstract

The application of Needleman-Wunsch alignment techniques to biological sequences is complicated by two serious problems when the sequences are long: the running time, which scales as the product of the lengths of sequences, and the difficulty in obtaining suitable parameters that produce meaningful alignments. The running time problem is often corrected by reducing the search space, using techniques such as banding, or chaining of high-scoring pairs. The parameter problem is more difficult to fix, partly because the probabilistic model, which Needleman-Wunsch is equivalent to, does not capture a key feature of biological sequence alignments, namely the alternation of conserved blocks and seemingly unrelated nonconserved segments. We present a solution to the problem of designing efficient search spaces for pair hidden Markov models that align biological sequences by taking advantage of their associated features. Our approach leads to an optimization problem, for which we obtain a 2-approximation algorithm, and that is based on the construction of Manhattan networks, which are close relatives of Steiner trees. We describe the underlying theory and show how our methods can be applied to alignment of DNA sequences in practice, successfully reducing the Viterbi algorithm search space of alignment PHMMs by three orders of magnitude.

摘要

当序列较长时,将Needleman-Wunsch比对技术应用于生物序列会受到两个严重问题的困扰:运行时间,它与序列长度的乘积成正比;以及难以获得能产生有意义比对的合适参数。运行时间问题通常通过减少搜索空间来解决,例如使用带状比对或高分对链接等技术。参数问题则更难解决,部分原因是与Needleman-Wunsch等价的概率模型没有捕捉到生物序列比对的一个关键特征,即保守块和看似不相关的非保守片段的交替。我们提出了一种解决方案,通过利用生物序列的相关特征为成对隐马尔可夫模型设计高效的搜索空间。我们的方法导致了一个优化问题,对于这个问题我们得到了一个2近似算法,该算法基于曼哈顿网络的构建,曼哈顿网络是斯坦纳树的近亲。我们描述了其基础理论,并展示了我们的方法在实际中如何应用于DNA序列比对,成功地将比对PHMMs的维特比算法搜索空间减少了三个数量级。

相似文献

1
Picking alignments from (Steiner) trees.从(斯坦纳)树中选取比对。
J Comput Biol. 2003;10(3-4):509-20. doi: 10.1089/10665270360688156.
2
Pair hidden Markov models on tree structures.树结构上的成对隐马尔可夫模型。
Bioinformatics. 2003;19 Suppl 1:i232-40. doi: 10.1093/bioinformatics/btg1032.
3
Dynamic programming.动态规划
Methods Mol Biol. 2014;1079:3-27. doi: 10.1007/978-1-62703-646-7_1.
4
Hidden Markov models and optimized sequence alignments.隐马尔可夫模型与优化序列比对
Comput Biol Chem. 2003 Feb;27(1):77-84. doi: 10.1016/s1476-9271(02)00096-8.
5
Bayesian coestimation of phylogeny and sequence alignment.系统发育与序列比对的贝叶斯联合估计
BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83.
6
A teaching approach from the exhaustive search method to the Needleman-Wunsch algorithm.一种从穷举搜索法到Needleman-Wunsch算法的教学方法。
Biochem Mol Biol Educ. 2017 May;45(3):194-204. doi: 10.1002/bmb.21027. Epub 2016 Oct 14.
7
Statistical significance of probabilistic sequence alignment and related local hidden Markov models.概率序列比对及相关局部隐马尔可夫模型的统计学显著性。
J Comput Biol. 2001;8(3):249-82. doi: 10.1089/10665270152530845.
8
An algorithm for progressive multiple alignment of sequences with insertions.一种用于含插入序列的渐进多序列比对算法。
Proc Natl Acad Sci U S A. 2005 Jul 26;102(30):10557-62. doi: 10.1073/pnas.0409137102. Epub 2005 Jul 6.
9
DIALIGN-T: an improved algorithm for segment-based multiple sequence alignment.DIALIGN-T:一种改进的基于片段的多序列比对算法。
BMC Bioinformatics. 2005 Mar 22;6:66. doi: 10.1186/1471-2105-6-66.
10
From analysis of protein structural alignments toward a novel approach to align protein sequences.从蛋白质结构比对分析到一种比对蛋白质序列的新方法。
Proteins. 2004 Feb 15;54(3):569-82. doi: 10.1002/prot.10503.

引用本文的文献

1
SLAM: cross-species gene finding and alignment with a generalized pair hidden Markov model.SLAM:跨物种基因发现及使用广义配对隐马尔可夫模型进行比对
Genome Res. 2003 Mar;13(3):496-502. doi: 10.1101/gr.424203.