Department of Electrical Engineering, National Chiayi University, Chiayi, Taiwan.
MindtronicAI Co., Ltd., 7F, No. 218, Sec. 6, Roosevelt Rd., 24105, Taipei, Taiwan.
BMC Bioinformatics. 2023 Mar 28;24(1):122. doi: 10.1186/s12859-023-05238-8.
As RNA secondary structure is closely related to stability and function, structure prediction is of great value to biological research. Traditional computational prediction of RNA secondary structure is mainly based on a thermodynamic model, with dynamic programming used to find the optimal structure. However, the prediction performance of this traditional approach is unsatisfactory for further research. Moreover, the computational complexity of structure prediction using dynamic programming is [Formula: see text]; it becomes [Formula: see text] for RNA structures with pseudoknots, which is computationally impractical for large-scale analysis.
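To make the dynamic programming idea concrete, the following is a minimal sketch of a Nussinov-style base-pair maximization recurrence. It is a deliberately simplified stand-in and not the thermodynamic model referred to above: it counts nested canonical pairs instead of minimizing free energy, and it cannot represent pseudoknots. The function name `nussinov_max_pairs` and the `min_loop` parameter are illustrative assumptions.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested canonical base pairs in seq.

    Simplified illustration only: real thermodynamic folding uses energy
    parameters, not a plain pair count. min_loop is the assumed minimum
    number of unpaired bases enclosed by a hairpin.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j is left unpaired.
            score = best[i][j - 1]
            # Case 2: position j pairs with some k, splitting the interval.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]
```

The three nested loops over sequence positions (`span`, `i`, `k`) are what make this family of algorithms expensive on long sequences, even before pseudoknots are considered.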
In this paper, we propose REDfold, a novel deep learning-based method for RNA secondary structure prediction. REDfold utilizes a CNN-based encoder-decoder network to learn the short- and long-range dependencies within the RNA sequence, and the network is further integrated with symmetric skip connections to efficiently propagate activation information across layers. Moreover, the network output is post-processed with constrained optimization to yield favorable predictions even for RNAs with pseudoknots. Experimental results based on the ncRNA database demonstrate that REDfold achieves better performance in terms of efficiency and accuracy, outperforming the contemporary state-of-the-art methods.
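The post-processing step can be pictured as turning a matrix of pairing scores into a valid set of base pairs. The sketch below is a simple greedy stand-in under stated assumptions, not the constrained optimization used by REDfold: it assumes the network emits an n-by-n score matrix, and it enforces three common constraints (canonical pairs only, a minimum hairpin loop length, and at most one partner per base). The names `decode_pairs`, `threshold`, and `min_loop` are hypothetical.

```python
def decode_pairs(scores, seq, threshold=0.5, min_loop=3):
    """Greedily pick base pairs from a score matrix under simple constraints.

    scores: n x n nested list of pairing scores (assumed network output).
    Returns a sorted list of (i, j) pairs with i < j. Crossing pairs are
    allowed, so pseudoknotted structures can be returned.
    """
    canonical = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    candidates = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            # Symmetrize, since a pair (i, j) is the same as (j, i).
            s = 0.5 * (scores[i][j] + scores[j][i])
            if s > threshold and (seq[i], seq[j]) in canonical:
                candidates.append((s, i, j))
    # Highest score first; ties broken by position for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    partner = [-1] * n
    for _, i, j in candidates:
        # Each base may take part in at most one pair.
        if partner[i] == -1 and partner[j] == -1:
            partner[i] = j
            partner[j] = i
    return sorted((i, p) for i, p in enumerate(partner) if p > i)
```

Because no nesting requirement is imposed, two pairs such as (0, 6) and (3, 9) can both be kept, which is the sense in which a matrix-based decoder can accommodate pseudoknots. A greedy choice is not guaranteed to be globally optimal, which is one reason a method might prefer a proper constrained optimization.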