Rapid assessment of extremal statistics for gapped local alignment.

Authors

Olsen R, Bundschuh R, Hwa T

Affiliation

Department of Physics, University of California at San Diego, La Jolla 92093-0319, USA.

Publication information

Proc Int Conf Intell Syst Mol Biol. 1999:211-22.

PMID: 10786304

Abstract

The statistical significance of gapped local alignments is characterized by analyzing the extremal statistics of the scores obtained from the alignment of random amino acid sequences. By identifying a complete set of linked clusters, "islands," we devise a method which accurately predicts the extremal score statistics by using only one to a few pairwise alignments. The success of our method relies crucially on the link between the statistics of island scores and extremal score statistics. This link is motivated by heuristic arguments, and firmly established by extensive numerical simulations for a variety of scoring parameter settings and sequence lengths. Our approach is several orders of magnitude faster than the widely used shuffling method, since island counting is trivially incorporated into the basic Smith-Waterman alignment algorithm with minimal computational cost, and all islands are counted in a single alignment. The availability of a rapid and accurate significance estimation method gives one the flexibility to fine tune scoring parameters to detect weakly homologous sequences and obtain optimal alignment fidelity.
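The island-counting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not spell out the details, so several choices here are assumptions: a linear gap penalty, a diagonal-first tie-breaking rule, and the hypothetical function names `island_peak_scores` and `estimate_lambda`. An island is taken to begin at a cell whose score rises above zero from a zero-scoring predecessor, and every positive cell inherits the island of the predecessor its score came from. The decay-rate estimate is the standard maximum-likelihood formula for a geometric tail of integer scores, which is assumed here rather than taken from the abstract.

```python
import math


def island_peak_scores(a, b, score, gap):
    """Smith-Waterman pass (linear gap cost, an assumption) that also labels islands.

    Returns a list with the peak score of every island found in one alignment.
    """
    m = len(b)
    h_prev = [0] * (m + 1)      # scores of the previous row
    id_prev = [-1] * (m + 1)    # island id per cell, -1 means "no island"
    peaks = []                  # peaks[k] = highest score seen on island k
    for i in range(1, len(a) + 1):
        h_cur = [0] * (m + 1)
        id_cur = [-1] * (m + 1)
        for j in range(1, m + 1):
            # Diagonal move is tried first, so it wins ties (assumed rule).
            best = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            isl = id_prev[j - 1]
            if h_prev[j] - gap > best:          # gap in sequence b
                best, isl = h_prev[j] - gap, id_prev[j]
            if h_cur[j - 1] - gap > best:       # gap in sequence a
                best, isl = h_cur[j - 1] - gap, id_cur[j - 1]
            if best > 0:
                if isl == -1:                   # predecessor scored zero: new island
                    isl = len(peaks)
                    peaks.append(0)
                h_cur[j] = best
                id_cur[j] = isl
                peaks[isl] = max(peaks[isl], best)
            # otherwise the cell stays at zero and belongs to no island
        h_prev, id_prev = h_cur, id_cur
    return peaks


def estimate_lambda(peaks, cutoff):
    """Geometric-tail maximum-likelihood estimate of the score decay rate.

    Uses only islands with peak score >= cutoff (integer scores assumed).
    """
    tail = [s for s in peaks if s >= cutoff]
    if not tail:
        raise ValueError("no island reaches the cutoff")
    excess = sum(tail) / len(tail) - cutoff
    if excess <= 0:
        raise ValueError("all tail islands sit exactly at the cutoff")
    return math.log(1.0 + 1.0 / excess)


def simple_score(x, y):
    """Toy scoring function for illustration: +1 match, -1 mismatch."""
    return 1 if x == y else -1
```

In practice one would call `island_peak_scores` on one or a few pairs of random sequences generated with the scoring parameters of interest, then pass the resulting peaks to `estimate_lambda` with a cutoff high enough that the tail looks exponential. Because the island labels ride along with the usual dynamic-programming recursion, the extra cost per cell is a few comparisons.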

Similar articles

1. Rapid assessment of extremal statistics for gapped local alignment.
   Proc Int Conf Intell Syst Mol Biol. 1999:211-22.
2. Statistical significance of probabilistic sequence alignment and related local hidden Markov models.
   J Comput Biol. 2001;8(3):249-82. doi: 10.1089/10665270152530845.
3. Replica model for an unusual directed polymer in 1+1 dimensions and prediction of the extremal parameter of gapped sequence alignment statistics.
   Phys Rev E Stat Nonlin Soft Matter Phys. 2004 Jun;69(6 Pt 1):061904. doi: 10.1103/PhysRevE.69.061904. Epub 2004 Jun 1.
4. Convergent Island Statistics: a fast method for determining local alignment score significance.
   Bioinformatics. 2005 Jun 15;21(12):2827-31. doi: 10.1093/bioinformatics/bti433. Epub 2005 Apr 7.
5. Toward an accurate statistics of gapped alignments.
   Bull Math Biol. 2005 Jan;67(1):169-91. doi: 10.1016/j.bulm.2004.07.001.
6. SALSA: improved protein database searching by a new algorithm for assembly of sequence fragments into gapped alignments.
   Bioinformatics. 1998;14(10):839-45. doi: 10.1093/bioinformatics/14.10.839.
7. Accurate anchoring alignment of divergent sequences.
   Bioinformatics. 2006 Jan 1;22(1):29-34. doi: 10.1093/bioinformatics/bti772. Epub 2005 Nov 13.
8. From analysis of protein structural alignments toward a novel approach to align protein sequences.
   Proteins. 2004 Feb 15;54(3):569-82. doi: 10.1002/prot.10503.
9. Asymmetric exclusion process and extremal statistics of random sequences.
   Phys Rev E Stat Nonlin Soft Matter Phys. 2002 Mar;65(3 Pt 1):031911. doi: 10.1103/PhysRevE.65.031911. Epub 2002 Mar 5.
10. Approximate statistics of gapped alignments.
    J Comput Biol. 1999 Spring;6(1):91-112. doi: 10.1089/cmb.1999.6.91.

Cited by

1. Confidence assignment for mass spectrometry based peptide identifications via the extreme value distribution.
   Bioinformatics. 2016 Sep 1;32(17):2642-9. doi: 10.1093/bioinformatics/btw225. Epub 2016 Apr 29.
2. Statistical Mechanics of Transcription-Factor Binding Site Discovery Using Hidden Markov Models.
   J Stat Phys. 2011 Apr;142(6):1187-1205. doi: 10.1007/s10955-010-0102-x.
3. Accelerating pairwise statistical significance estimation for local alignment by harvesting GPU's power.
   BMC Bioinformatics. 2012 Apr 12;13 Suppl 5(Suppl 5):S3. doi: 10.1186/1471-2105-13-S5-S3.
4. PhyLAT: a phylogenetic local alignment tool.
   Bioinformatics. 2012 May 15;28(10):1336-44. doi: 10.1093/bioinformatics/bts158. Epub 2012 Apr 6.
5. Objective method for estimating asymptotic parameters, with an application to sequence alignment.
   Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Sep;84(3 Pt 1):031914. doi: 10.1103/PhysRevE.84.031914. Epub 2011 Sep 13.
6. Estimating the Gumbel scale parameter for local alignment of random sequences by importance sampling with stopping times.
   Ann Stat. 2009 Dec 1;37(6A):3697. doi: 10.1214/08-AOS663.
7. Back-translation for discovering distant protein homologies in the presence of frameshift mutations.
   Algorithms Mol Biol. 2010 Jan 4;5(1):6. doi: 10.1186/1748-7188-5-6.
8. Island method for estimating the statistical significance of profile-profile alignment scores.
   BMC Bioinformatics. 2009 Apr 20;10:112. doi: 10.1186/1471-2105-10-112.
9. Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty.
   BMC Bioinformatics. 2009 Mar 19;10 Suppl 3(Suppl 3):S1. doi: 10.1186/1471-2105-10-S3-S1.
10. Powerful fusion: PSI-BLAST and consensus sequences.
    Bioinformatics. 2008 Sep 15;24(18):1987-93. doi: 10.1093/bioinformatics/btn384. Epub 2008 Aug 4.