A statistical sampling algorithm for RNA secondary structure prediction.

Author information

Ding Ye, Lawrence Charles E

Affiliation

Bioinformatics Center, Wadsworth Center, New York State Department of Health, 150 New Scotland Avenue, Albany, NY 12208, USA.

Publication information

Nucleic Acids Res. 2003 Dec 15;31(24):7280-301. doi: 10.1093/nar/gkg938.

DOI:10.1093/nar/gkg938

PMID:14654704

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC297010/

Abstract

An RNA molecule, particularly a long-chain mRNA, may exist as a population of structures. Furthermore, multiple structures have been demonstrated to play important functional roles. Thus, a representation of the ensemble of probable structures is of interest. We present a statistical algorithm to sample rigorously and exactly from the Boltzmann ensemble of secondary structures. The forward step of the algorithm computes the equilibrium partition functions of RNA secondary structures with recent thermodynamic parameters. Using conditional probabilities computed with the partition functions in a recursive sampling process, the backward step of the algorithm quickly generates a statistically representative sample of structures. With cubic run time for the forward step, quadratic run time in the worst case for the sampling step, and quadratic storage, the algorithm is efficient for broad applicability. We demonstrate that, by classifying sampled structures, the algorithm enables a statistical delineation and representation of the Boltzmann ensemble. Applications of the algorithm show that alternative biological structures are revealed through sampling. Statistical sampling provides a means to estimate the probability of any structural motif, with or without constraints. For example, the algorithm enables probability profiling of single-stranded regions in RNA secondary structure. Probability profiling for specific loop types is also illustrated. By overlaying probability profiles, a mutual accessibility plot can be displayed for predicting RNA:RNA interactions. Boltzmann probability-weighted density of states and free energy distributions of sampled structures can be readily computed. We show that a sample of moderate size from the ensemble of an enormous number of possible structures is sufficient to guarantee statistical reproducibility in the estimates of typical sampling statistics.
Our applications suggest that the sampling algorithm may be well suited to prediction of mRNA structure and target accessibility. The algorithm is applicable to the rational design of small interfering RNAs (siRNAs), antisense oligonucleotides, and trans-cleaving ribozymes in gene knock-down studies.
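
The forward and backward steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy energy model in which every base pair contributes the same hypothetical energy (`PAIR_ENERGY`), with no stacking or loop terms, in place of the thermodynamic parameters the paper uses. The function names, the minimum hairpin loop size, and the value of `RT` are likewise assumptions made for illustration. The forward step fills a partition function table, the backward step draws structures by stochastic traceback using conditional probabilities from that table, and the sample is then used to estimate a per-base probability of being single-stranded.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3          # minimum number of unpaired bases enclosed by a pair
PAIR_ENERGY = -1.0    # hypothetical energy per base pair, kcal/mol
RT = 0.616            # approximate R*T at 37 C, kcal/mol
WEIGHT = math.exp(-PAIR_ENERGY / RT)  # Boltzmann weight of one base pair


def partition_table(seq):
    """Forward step: Q[i][j] sums Boltzmann weights over structures on seq[i..j]."""
    n = len(seq)
    Q = [[1.0] * n for _ in range(n)]  # short spans admit only the open chain
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            total = Q[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - MIN_LOOP):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = Q[i][k - 1] if k > i else 1.0
                    total += left * WEIGHT * Q[k + 1][j - 1]
            Q[i][j] = total
    return Q


def sample_structure(seq, Q, rng):
    """Backward step: draw one structure with probability proportional to its weight."""
    pairs = []
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue  # too short to hold a pair
        r = rng.random() * Q[i][j] - Q[i][j - 1]
        if r < 0:  # base j unpaired
            stack.append((i, j - 1))
            continue
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) not in PAIRS:
                continue
            left = Q[i][k - 1] if k > i else 1.0
            r -= left * WEIGHT * Q[k + 1][j - 1]
            if r < 0:  # base j pairs with k; recurse on both sides
                pairs.append((k, j))
                stack.append((i, k - 1))
                stack.append((k + 1, j - 1))
                break
        else:
            stack.append((i, j - 1))  # guard against floating-point rounding
    return pairs


def unpaired_profile(seq, n_samples=1000, seed=0):
    """Estimate, for each base, the probability that it is single-stranded."""
    rng = random.Random(seed)
    Q = partition_table(seq)
    paired = [0] * len(seq)
    for _ in range(n_samples):
        for i, j in sample_structure(seq, Q, rng):
            paired[i] += 1
            paired[j] += 1
    return [1.0 - c / n_samples for c in paired]


if __name__ == "__main__":
    profile = unpaired_profile("GGGAAAUCCC", n_samples=2000)
    print([round(p, 2) for p in profile])
```

In this sketch the table fill takes cubic time and quadratic storage, and each traceback takes at most quadratic time, which mirrors the complexity stated in the abstract. A realistic version would need separate tables for the different loop types of a full thermodynamic model.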
