A simulated annealing algorithm for finding consensus sequences.

Authors

Keith Jonathan M, Adams Peter, Bryant Darryn, Kroese Dirk P, Mitchelson Keith R, Cochran Duncan A E, Lala Gita H

Affiliation

Department of Mathematics, The University of Queensland, Qld 4072, Australia.

Publication information

Bioinformatics. 2002 Nov;18(11):1494-9. doi: 10.1093/bioinformatics/18.11.1494.

DOI:10.1093/bioinformatics/18.11.1494
PMID:12424121
Abstract

MOTIVATION

A consensus sequence for a family of related sequences is, as the name suggests, a sequence that captures the features common to most members of the family. Consensus sequences are important in various DNA sequencing applications and are a convenient way to characterize a family of molecules.

RESULTS

This paper describes a new algorithm for finding a consensus sequence, using the popular optimization method known as simulated annealing. Unlike the conventional approach of finding a consensus sequence by first forming a multiple sequence alignment, this algorithm searches for a sequence that minimises the sum of pairwise distances to each of the input sequences. The resulting consensus sequence can then be used to induce a multiple sequence alignment. The time required by the algorithm scales linearly with the number of input sequences and quadratically with the length of the consensus sequence. We present results demonstrating the high quality of the consensus sequences and alignments produced by the new algorithm. For comparison, we also present similar results obtained using ClustalW. The new algorithm outperforms ClustalW in many cases.
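
The abstract states only the objective (minimise the sum of pairwise distances from a candidate consensus to each input sequence) and the optimisation method (simulated annealing). The following is a minimal sketch of that idea, not the authors' implementation. Everything not in the abstract is an assumption made for illustration: Levenshtein edit distance as the pairwise distance, single-character substitution, insertion, and deletion moves, a geometric cooling schedule, the DNA alphabet `ACGT`, and all function and parameter names.

```python
import math
import random

ALPHABET = "ACGT"  # assumed DNA alphabet


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance by row-wise dynamic programming, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # match or substitute
        prev = curr
    return prev[-1]


def total_distance(candidate: str, sequences: list[str]) -> int:
    """Objective: sum of distances from the candidate to every input sequence."""
    return sum(edit_distance(candidate, s) for s in sequences)


def propose(candidate: str, rng: random.Random) -> str:
    """Assumed move set: one random substitution, insertion, or deletion."""
    move = rng.choice(("sub", "ins", "del"))
    if move == "ins" or not candidate:
        pos = rng.randrange(len(candidate) + 1)
        return candidate[:pos] + rng.choice(ALPHABET) + candidate[pos:]
    pos = rng.randrange(len(candidate))
    if move == "sub":
        return candidate[:pos] + rng.choice(ALPHABET) + candidate[pos + 1:]
    return candidate[:pos] + candidate[pos + 1:]


def anneal_consensus(sequences, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over candidate consensus strings."""
    rng = random.Random(seed)
    current = sequences[0]  # start from one of the inputs
    current_cost = total_distance(current, sequences)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        cand = propose(current, rng)
        cand_cost = total_distance(cand, sequences)
        delta = cand_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTTCGT", "ACGAACGT", "ACGTACG"]
    consensus, cost = anneal_consensus(reads, steps=2000)
    print(consensus, cost)
```

In this sketch each evaluation of the objective runs one dynamic-programming comparison per input sequence, so its cost grows linearly with the number of sequences and roughly quadratically with sequence length, which is in line with the scaling quoted in the abstract.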

Similar articles

1
A simulated annealing algorithm for finding consensus sequences.
Bioinformatics. 2002 Nov;18(11):1494-9. doi: 10.1093/bioinformatics/18.11.1494.
2
Generating consensus sequences from partial order multiple sequence alignment graphs.
Bioinformatics. 2003 May 22;19(8):999-1008. doi: 10.1093/bioinformatics/btg109.
3
Multiple alignment using hidden Markov models.
Proc Int Conf Intell Syst Mol Biol. 1995;3:114-20.
4
Subtle motifs: defining the limits of motif finding algorithms.
Bioinformatics. 2002 Oct;18(10):1382-90. doi: 10.1093/bioinformatics/18.10.1382.
5
Improved Hidden Markov Model training for multiple sequence alignment by a particle swarm optimization-evolutionary algorithm hybrid.
Biosystems. 2003 Nov;72(1-2):5-17. doi: 10.1016/s0303-2647(03)00131-x.
6
A graph based algorithm for generating EST consensus sequences.
Bioinformatics. 2005 Apr 15;21(8):1371-5. doi: 10.1093/bioinformatics/bti184. Epub 2004 Nov 30.
7
Bayesian restoration of a hidden Markov chain with applications to DNA sequencing.
J Comput Biol. 1999 Summer;6(2):261-77. doi: 10.1089/cmb.1999.6.261.
8
Finding motifs in the twilight zone.
Bioinformatics. 2002 Oct;18(10):1374-81. doi: 10.1093/bioinformatics/18.10.1374.
9
Algorithms for sequence analysis via mutagenesis.
Bioinformatics. 2004 Oct 12;20(15):2401-10. doi: 10.1093/bioinformatics/bth258. Epub 2004 May 14.
10
Detecting recombination with MCMC.
Bioinformatics. 2002;18 Suppl 1:S345-53. doi: 10.1093/bioinformatics/18.suppl_1.s345.

Cited by

1
SUP: a probabilistic framework to propagate genome sequence uncertainty, with applications.
NAR Genom Bioinform. 2023 Apr 24;5(2):lqad038. doi: 10.1093/nargab/lqad038. eCollection 2023 Jun.
2
Analysis of 11,430 recombinant protein production experiments reveals that protein yield is tunable by synonymous codon changes of translation initiation sites.
PLoS Comput Biol. 2021 Oct 5;17(10):e1009461. doi: 10.1371/journal.pcbi.1009461. eCollection 2021 Oct.
3
An improved protocol for sequencing of repetitive genomic regions and structural variations using mutagenesis and next generation sequencing.
PLoS One. 2012;7(8):e43359. doi: 10.1371/journal.pone.0043359. Epub 2012 Aug 17.
4
Unlocking hidden genomic sequence.
Nucleic Acids Res. 2004 Feb 18;32(3):e35. doi: 10.1093/nar/gnh022.