Random generation of RNA secondary structures according to native distributions.

Authors

Nebel Markus E, Scheid Anika, Weinberg Frank

Affiliation

Department of Computer Science, University of Kaiserslautern, Germany.

Publication

Algorithms Mol Biol. 2011 Oct 12;6:24. doi: 10.1186/1748-7188-6-24.

DOI:10.1186/1748-7188-6-24
PMID:21992500
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3354341/
Abstract

BACKGROUND

Random biological sequences are of great interest in genome analysis because, according to a powerful paradigm, they represent the background noise from which the actual biological information must be distinguished. Accordingly, the generation of random sequences has been investigated for a long time. Random objects with a more complicated structure, such as RNA molecules or proteins, are of similar interest.

RESULTS

In this article, we present a new general framework for deriving algorithms for the non-uniform random generation of combinatorial objects according to the encoding and probability distribution implied by a stochastic context-free grammar. Briefly, the framework extends the well-known recursive method for (uniform) random generation and uses the popular framework of admissible specifications of combinatorial classes, introducing weighted combinatorial classes to allow for non-uniform generation by means of unranking. This framework is used to derive an algorithm for the generation of RNA secondary structures of a given fixed size. We address the random generation of these structures according to a realistic distribution obtained from real-life data by using a very detailed context-free grammar (one that models the class of RNA secondary structures by distinguishing between all known motifs in RNA structure). Compared to well-known sampling approaches used in several structure prediction tools (such as SFold), ours has two major advantages. First, after a preprocessing step in time $O(n^2)$ for the computation of all weighted class sizes needed, our approach computes a set of $m$ random secondary structures of a given structure size $n$ in worst-case time $O(m \cdot n \cdot \log n)$, while other algorithms typically have a runtime in $O(m \cdot n^2)$. Second, our approach works with integer arithmetic only, which is faster and avoids the discomforting details of floating-point arithmetic with logarithmized probabilities.
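The idea of weighted class sizes plus unranking can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use their detailed grammar: it assumes a toy grammar S -> ε | . S | ( S ) S over dot-bracket strings, and the integer rule weights `W_UNPAIRED` and `W_PAIR` are hypothetical values chosen only for illustration. The sketch also scans the split positions linearly, so a single sample can take quadratic time; it does not reproduce the $O(n \cdot \log n)$ per-structure bound stated in the abstract.

```python
import random

# Hypothetical integer rule weights (assumptions, not taken from the paper).
W_UNPAIRED = 3  # weight of rule S -> . S
W_PAIR = 2      # weight of rule S -> ( S ) S


def weighted_sizes(n):
    """Preprocessing: W[k] = total weight of all structures of length k."""
    W = [0] * (n + 1)
    W[0] = 1  # the empty structure has weight 1
    for k in range(1, n + 1):
        total = W_UNPAIRED * W[k - 1]
        for i in range(k - 1):  # i = length enclosed by the first base pair
            total += W_PAIR * W[i] * W[k - 2 - i]
        W[k] = total
    return W


def unrank(n, r, W):
    """Map an integer rank r in [0, W[n]) to a dot-bracket string of length n.

    A structure of weight w is the image of exactly w ranks, so a uniform
    rank yields a structure with probability w / W[n].
    """
    if n == 0:
        return ""
    block = W_UNPAIRED * W[n - 1]
    if r < block:
        return "." + unrank(n - 1, r % W[n - 1], W)
    r -= block
    for i in range(n - 1):  # linear scan over split positions
        j = n - 2 - i
        block = W_PAIR * W[i] * W[j]
        if r < block:
            r %= W[i] * W[j]
            inner, rest = divmod(r, W[j])
            return "(" + unrank(i, inner, W) + ")" + unrank(j, rest, W)
        r -= block
    raise ValueError("rank out of range")


def sample(n, W, rng=random):
    """Draw one structure of length n using integer arithmetic only."""
    return unrank(n, rng.randrange(W[n]), W)


if __name__ == "__main__":
    W = weighted_sizes(30)  # computed once, reused for every sample
    for _ in range(5):
        print(sample(30, W))
```

The table `W` is built once in quadratic time and then shared by all samples, which mirrors the split between preprocessing and generation described above. Because Python integers have arbitrary precision, no floating-point or log-probability handling is needed. The toy grammar permits empty hairpin loops such as `()`, which a realistic RNA grammar would exclude.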

CONCLUSION

A number of experimental results show that our random generation method produces realistic output, at least with respect to the appearance of the different structural motifs. The algorithm is available as a web service at http://wwwagak.cs.uni-kl.de/NonUniRandGen and can be used for generating random secondary structures of any specified RNA type. A link to download an implementation of our method (in Wolfram Mathematica) can be found there, too.
