Yang Yuedong, Zhao Huiying, Wang Jihua, Zhou Yaoqi
School of Informatics, Indiana University Purdue University, Indianapolis, IN, USA.
Methods Mol Biol. 2014;1137:119-30. doi: 10.1007/978-1-4939-0366-5_9.
RNA-binding proteins (RBPs) play key roles in RNA metabolism and post-transcriptional regulation. Until now, computational methods have treated related problems separately: machine-learning techniques predict RBPs and RNA-binding residues, while rigid or semiflexible structure-to-structure docking predicts protein-RNA complex structures. Here, we describe a template-based technique called SPOT-Seq-RNA that integrates prediction of RBPs, RNA-binding residues, and protein-RNA complex structures into a single package. This integration is achieved by combining template-based structure-prediction software, SPARKS X, with binding affinity prediction software, DRNA. The tool yields reasonable sensitivity (46%) and high precision (84%) on an independent test set of 215 RBPs and 5,766 non-RBPs. SPOT-Seq-RNA is computationally efficient enough for genome-scale prediction of RBPs and protein-RNA complex structures. Applied to the human genome, it achieved a similar sensitivity and uncovered hundreds of novel RBPs beyond those detectable by simple homology. The online server and downloadable version of SPOT-Seq-RNA are available at http://sparks-lab.org/server/SPOT-Seq-RNA/.
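To make the reported figures concrete, the sketch below back-calculates the approximate confusion-matrix counts implied by a sensitivity of 46% and a precision of 84% on 215 RBPs and 5,766 non-RBPs. The function name and the rounding are assumptions made for illustration; the counts are estimates derived from the rounded percentages, not values reported in the abstract.

```python
def implied_confusion_counts(n_pos, n_neg, sensitivity, precision):
    """Estimate TP/FN/FP/TN from test-set sizes and two summary metrics.

    Hypothetical helper: the results are approximate because the
    published percentages are themselves rounded.
    """
    # sensitivity = TP / (TP + FN), and TP + FN = n_pos
    tp = round(sensitivity * n_pos)
    fn = n_pos - tp
    # precision = TP / (TP + FP)  =>  FP = TP * (1 - precision) / precision
    fp = round(tp * (1 - precision) / precision)
    tn = n_neg - fp
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn}


# Test-set sizes and metrics as stated in the abstract.
counts = implied_confusion_counts(n_pos=215, n_neg=5766,
                                  sensitivity=0.46, precision=0.84)

# Fraction of non-RBPs that would be wrongly flagged as RBPs.
false_positive_rate = counts["fp"] / (counts["fp"] + counts["tn"])

print(counts)
print(f"implied false-positive rate: {false_positive_rate:.2%}")
```

Under these assumptions, roughly 99 of the 215 RBPs are recovered while only about 19 of the 5,766 non-RBPs are misclassified, which is why precision remains high despite the moderate sensitivity.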