Tarafder Sumit, Bhattacharya Debswapna
Department of Computer Science, Virginia Tech, Blacksburg, Virginia, 24061, USA.
bioRxiv. 2024 Jul 11:2023.11.04.565599. doi: 10.1101/2023.11.04.565599.
A scoring function that can reliably assess the accuracy of a 3D RNA structural model in the absence of an experimental structure is important not only for model evaluation and selection but also for scoring-guided conformational sampling. However, high-fidelity RNA scoring has proven difficult with both conventional knowledge-based statistical potentials and currently available machine learning-based approaches. Here we present lociPARSE, a locality-aware invariant point attention architecture for scoring RNA 3D structures. Unlike existing machine learning methods that estimate superposition-based root mean square deviation (RMSD), lociPARSE estimates Local Distance Difference Test (lDDT) scores that capture the accuracy of each nucleotide and its surrounding local atomic environment in a superposition-free manner, and then aggregates this information to predict global structural accuracy. Tested on multiple datasets including CASP15, lociPARSE significantly outperforms existing statistical potentials (rsRNASP, cgRNASP, DFIRE-RNA, and RASP) and machine learning methods (ARES and RNA3DCNN) across complementary assessment metrics. lociPARSE is freely available at https://github.com/Bhattacharya-Lab/lociPARSE.
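To make the superposition-free idea concrete, the following is a minimal sketch of how a per-nucleotide lDDT-style score can be computed when a reference structure is available, and how the per-nucleotide values can be averaged into a global score. It illustrates the quantity being estimated, not the lociPARSE network itself, which predicts such scores without a reference. Several choices here are assumptions for illustration: one representative atom per nucleotide, the conventional lDDT inclusion radius of 15 Å, the conventional tolerance thresholds of 0.5, 1, 2, and 4 Å, and the hypothetical function names `per_nucleotide_lddt` and `global_lddt`.

```python
import numpy as np


def per_nucleotide_lddt(ref, model, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified lDDT per nucleotide (illustrative assumption, not lociPARSE code).

    ref, model: (N, 3) arrays with one representative atom per nucleotide,
    listed in the same order. No superposition is performed: only
    intra-structure distances are compared.
    """
    ref = np.asarray(ref, dtype=float)
    model = np.asarray(model, dtype=float)
    if ref.shape != model.shape or ref.ndim != 2 or ref.shape[1] != 3:
        raise ValueError("ref and model must both have shape (N, 3)")

    # All-against-all distance matrices within each structure.
    d_ref = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=-1)
    d_model = np.linalg.norm(model[:, None, :] - model[None, :, :], axis=-1)

    n = len(ref)
    # Local environment: pairs closer than `radius` in the reference, excluding self.
    local = (d_ref < radius) & ~np.eye(n, dtype=bool)

    # For each pair, the fraction of thresholds under which the distance is preserved.
    diff = np.abs(d_ref - d_model)
    preserved = np.mean([diff < t for t in thresholds], axis=0)

    # Average over each nucleotide's local pairs; nucleotides with no neighbors score 0.
    counts = local.sum(axis=1)
    scores = np.zeros(n)
    np.divide((preserved * local).sum(axis=1), counts, out=scores, where=counts > 0)
    return scores


def global_lddt(ref, model, **kwargs):
    """Aggregate per-nucleotide scores into one global score by simple averaging."""
    return float(per_nucleotide_lddt(ref, model, **kwargs).mean())
```

Because only distances within each structure are compared, rotating or translating the model leaves every score unchanged, which is the sense in which the measure is superposition-free. Simple averaging is one possible aggregation and is assumed here for illustration.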