Chasles Simon, Major François
Department of Computer Science and Operations Research, and Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Québec H3C 3J7, Canada.
NAR Genom Bioinform. 2025 Jul 17;7(3):lqaf099. doi: 10.1093/nargab/lqaf099. eCollection 2025 Sep.
The RNA secondary (2D) structure prediction problem consists in determining the set of base pairs that form within an RNA molecule from its sequence. A related task is the RNA hybridization problem, where two RNA strands interact to form a duplex. Thermodynamics-based methods typically rely on experimentally determined energy parameters to compute minimum free energy structures for both single-stranded RNAs and duplexes. Through the Boltzmann distribution, these parameters can be used to estimate base-pairing probabilities. Here, we leverage these probabilities to simulate RNA:RNA interaction dynamics. Inspired by the Ising model, we apply Gibbs sampling to model the stochastic formation and disruption of base pairs over time in RNA duplexes, ultimately deriving a consensus structure. The resulting method, MC-DuplexFold (mcdf), enhances base-pair prediction accuracy when integrated with other RNA 2D structure prediction algorithms. Through benchmarking, we reaffirm the previously observed trend that approximate or heuristic methods, such as RIsearch and Sfold, outperform exact methods like RNAcofold and DuplexFold in structural prediction accuracy. Additionally, mcdf provides structural activity statistics that can be incorporated into the modeling of miRNA primary transcripts, precursors, and target interactions, thereby refining predictions of miRNA:mRNA duplex dynamics.
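To make the Ising-inspired Gibbs sampling idea concrete, the following is a minimal sketch and not the MC-DuplexFold implementation. It assumes an ungapped antiparallel duplex, one binary "paired/unpaired" variable per aligned position, and a nearest-neighbour stacking term that plays the role of the Ising coupling. The energy values (`PAIR_ENERGY`, `STACK_BONUS`) and the function names `gibbs_duplex` and `consensus` are hypothetical and chosen only for illustration; they are not the experimentally determined parameters that thermodynamics-based tools use.

```python
import math
import random

# Toy pairing energies in kcal/mol (illustrative values, NOT measured parameters).
PAIR_ENERGY = {
    ("G", "C"): -1.0, ("C", "G"): -1.0,
    ("A", "U"): -0.5, ("U", "A"): -0.5,
    ("G", "U"): 0.5, ("U", "G"): 0.5,
}
STACK_BONUS = -1.5  # toy coupling between adjacent formed pairs (Ising-like J)
KT = 0.616          # approximate RT in kcal/mol at 37 degrees C


def gibbs_duplex(strand_a, strand_b, sweeps=2000, burn_in=200, seed=0):
    """Return per-position pairing frequencies for an ungapped antiparallel duplex."""
    rng = random.Random(seed)
    n = min(len(strand_a), len(strand_b))
    # Candidate pair i joins strand_a[i] with strand_b[-1 - i] (antiparallel).
    # Non-complementary positions get None and are never allowed to pair.
    energies = [PAIR_ENERGY.get((strand_a[i], strand_b[-1 - i])) for i in range(n)]
    state = [0] * n   # 1 = pair currently formed, 0 = pair currently open
    counts = [0] * n  # number of post-burn-in sweeps in which each pair was formed
    for sweep in range(sweeps):
        for i in range(n):
            if energies[i] is None:
                continue
            left = state[i - 1] if i > 0 else 0
            right = state[i + 1] if i < n - 1 else 0
            # Energy change of forming pair i given its current neighbours.
            delta_e = energies[i] + STACK_BONUS * (left + right)
            # Boltzmann-weighted conditional probability that pair i is formed.
            p_on = 1.0 / (1.0 + math.exp(delta_e / KT))
            state[i] = 1 if rng.random() < p_on else 0
        if sweep >= burn_in:
            for i in range(n):
                counts[i] += state[i]
    kept = sweeps - burn_in
    return [c / kept for c in counts]


def consensus(freqs, threshold=0.5):
    """Indices of pairs formed in more than `threshold` of the sampled sweeps."""
    return [i for i, f in enumerate(freqs) if f > threshold]


if __name__ == "__main__":
    freqs = gibbs_duplex("GGAGG", "CCACC")
    print([round(f, 2) for f in freqs])
    print(consensus(freqs))
```

In this sketch the per-position frequencies stand in for the kind of structural activity statistics the abstract mentions, and the thresholded set of pairs stands in for a consensus structure. A real duplex model would also have to handle bulges, internal loops, and intramolecular pairs, which this toy version ignores.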