Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia.
Institute for Glycomics, Griffith University, Parklands Dr. Southport, QLD 4222, Australia.
Bioinformatics. 2022 Aug 10;38(16):3900-3910. doi: 10.1093/bioinformatics/btac421.
Recently, AlphaFold2 achieved near-experimental accuracy for the majority of proteins in the Critical Assessment of Structure Prediction (CASP14). This raises the hope that we may one day achieve the same feat for structured RNAs, whose structure prediction is as fundamentally and practically important as protein structure prediction. One major factor in the recent advancement of protein structure prediction is the highly accurate prediction of distance-based contact maps of proteins.
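To make the notion of a distance-based contact map concrete, the following is a minimal sketch, not the authors' implementation. It marks two residues as in contact when their representative atoms lie within a cutoff distance. The function name `contact_map`, the 10 Å cutoff and the toy coordinates are illustrative assumptions; the abstract does not state which atoms or threshold were used.

```python
import math

def contact_map(coords, cutoff=10.0, min_separation=1):
    """Return a symmetric 0/1 matrix; 1 marks a pair within `cutoff`.

    coords: one (x, y, z) tuple per residue, in angstroms.
    cutoff: distance threshold (assumed value, for illustration only).
    min_separation: minimum sequence separation |i - j| to consider.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Four hypothetical representative atoms (made-up coordinates).
coords = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0), (0.0, 7.0, 0.0)]
cmap = contact_map(coords)
for row in cmap:
    print(row)
```

The matrix is symmetric with a zero diagonal, so a predictor only needs to score the upper triangle.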
Here, we showed that by integrating deep learning with physics-inferred secondary structures, co-evolutionary information and multiple sequence-alignment sampling, we can achieve RNA contact-map prediction at a level of accuracy similar to that of protein contact-map prediction. More importantly, highly accurate prediction of the top L long-range contacts (L being the sequence length) can be assured for RNAs with a high effective number of homologous sequences (Neff > 50). An initial use of the predicted contact map as distance-based restraints confirmed its usefulness in 3D structure prediction.
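The top L long-range metric mentioned above can be sketched as follows. This is an illustrative reading of the standard contact-prediction measure, not code from SPOT-RNA-2D. The L highest-scoring pairs whose sequence separation meets a threshold are selected, and precision is the fraction of them that are true contacts. The function name, the default separation of 24 and the toy scores are assumptions.

```python
def top_l_long_range_precision(probs, true_contacts, min_separation=24):
    """Precision of the L highest-scoring long-range pairs.

    probs: L x L matrix of predicted contact probabilities.
    true_contacts: set of (i, j) index pairs with i < j.
    min_separation: smallest |i - j| counted as long-range (assumed value).
    """
    n = len(probs)
    candidates = [
        (probs[i][j], i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
    ]
    # Rank candidate pairs by predicted probability, highest first.
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = candidates[:n]  # keep at most L pairs
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if (i, j) in true_contacts)
    return hits / len(top)

# Toy example with L = 6 and a reduced separation threshold of 2.
L = 6
probs = [[0.1] * L for _ in range(L)]
probs[0][1] = 0.99  # short-range pair, excluded by the separation filter
scores = {(0, 5): 0.9, (1, 4): 0.8, (0, 3): 0.7,
          (2, 5): 0.6, (1, 3): 0.5, (0, 4): 0.4}
for (i, j), p in scores.items():
    probs[i][j] = p
true_contacts = {(0, 5), (1, 4), (2, 5)}
precision = top_l_long_range_precision(probs, true_contacts, min_separation=2)
print(precision)
```

In the toy example, three of the six selected pairs are true contacts, so the precision is 0.5.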
SPOT-RNA-2D is available as a web server at https://sparks-lab.org/server/spot-rna-2d/ and as a standalone program at https://github.com/jaswindersingh2/SPOT-RNA-2D.
Supplementary data are available at Bioinformatics online.