Estimating the Gumbel scale parameter for local alignment of random sequences by importance sampling with stopping times.

Authors

Park Yonil, Sheetlin Sergey, Spouge John L

Affiliation

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, Maryland 20894, USA.

Publication information

Ann Stat. 2009 Dec 1;37(6A):3697. doi: 10.1214/08-AOS663.

Abstract

The gapped local alignment score of two random sequences follows a Gumbel distribution. If computers could estimate the parameters of the Gumbel distribution within one second, the use of arbitrary alignment scoring schemes could increase the sensitivity of searching biological sequence databases over the web. Accordingly, this article gives a novel equation for the scale parameter of the relevant Gumbel distribution. We speculate that the equation is exact, although present numerical evidence is limited. The equation involves ascending ladder variates in the global alignment of random sequences. In global alignment simulations, the ladder variates yield stopping times specifying random sequence lengths. Because of the random lengths, and because our trial distribution for importance sampling occurs on a different sample space from our target distribution, our study led to a mapping theorem, which led naturally in turn to an efficient dynamic programming algorithm for the importance sampling weights. Numerical studies using several popular alignment scoring schemes then examined the efficiency and accuracy of the resulting simulations.
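As a rough illustration of importance sampling with a stopping time, the sketch below treats a much simpler, one-dimensional analogue: a random walk with negative drift, for which the probability of ever reaching a high level y decays like exp(-lambda * y). It is not the article's algorithm for gapped alignment. The function name `crossing_probability`, the Gaussian step distribution, and all parameter values are assumptions chosen for illustration. Steps are drawn from an exponentially tilted trial distribution with positive drift, the walk is stopped the first time it reaches y, and each path is weighted by the likelihood ratio of the target to the trial distribution up to that stopping time.

```python
import math
import random


def crossing_probability(y, mu=0.5, trials=2000, seed=0):
    """Estimate P(max_n S_n >= y) for a walk with N(-mu, 1) steps.

    Toy example only: target steps are N(-mu, 1), trial steps are N(+mu, 1).
    """
    rng = random.Random(seed)
    # theta solves E[exp(theta * X)] = 1 for X ~ N(-mu, 1); it plays the
    # role of the exponential decay rate of the tail in this toy model.
    theta = 2.0 * mu
    total = 0.0
    for _ in range(trials):
        s = 0.0
        # Simulate under the trial distribution until the stopping time
        # tau = first n with S_n >= y (finite, since the drift is positive).
        while s < y:
            s += rng.gauss(mu, 1.0)
        # Likelihood ratio target/trial along the path up to tau:
        # each step contributes exp(-2 * mu * x), so the product is
        # exp(-theta * S_tau).
        total += math.exp(-theta * s)
    return total / trials


if __name__ == "__main__":
    y = 5.0
    print("estimated P(max >= y):", crossing_probability(y))
    print("upper bound exp(-theta * y):", math.exp(-1.0 * y))
```

Because every simulated path stops at or above y, each weight is at most exp(-theta * y), so the estimator has small relative variance even when the target probability is tiny. That is the usual motivation for pairing exponential tilting with a stopping time.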


Similar articles

Significance of gapped sequence alignments. J Comput Biol. 2008 Nov;15(9):1187-94. doi: 10.1089/cmb.2008.0125.

Cited by

How sequence alignment scores correspond to probability models. Bioinformatics. 2020 Jan 15;36(2):408-415. doi: 10.1093/bioinformatics/btz576.
ALP & FALP: C++ libraries for pairwise local alignment E-values. Bioinformatics. 2016 Jan 15;32(2):304-5. doi: 10.1093/bioinformatics/btv575. Epub 2015 Oct 1.
Frameshift alignment: statistics and post-genomic applications. Bioinformatics. 2014 Dec 15;30(24):3575-82. doi: 10.1093/bioinformatics/btu576. Epub 2014 Aug 28.
Objective method for estimating asymptotic parameters, with an application to sequence alignment. Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Sep;84(3 Pt 1):031914. doi: 10.1103/PhysRevE.84.031914. Epub 2011 Sep 13.
