Chakrabarti Anob M, Haberman Nejc, Praznik Arne, Luscombe Nicholas M, Ule Jernej
The Francis Crick Institute, London NW1 1AT, United Kingdom.
Department of Genetics, Environment and Evolution, UCL Genetics Institute, University College London, London WC1E 6BT, United Kingdom.
Annu Rev Biomed Data Sci. 2018 Jul 20;1(1):235-261. doi: 10.1146/annurev-biodatasci-080917-013525.
An interplay of experimental and computational methods is required to achieve a comprehensive understanding of protein-RNA interactions. UV crosslinking and immunoprecipitation (CLIP) identifies endogenous interactions by sequencing RNA fragments that copurify with a selected RNA-binding protein under stringent conditions. Here we focus on approaches for the analysis of the resulting data and appraise the methods for peak calling, visualization, analysis, and computational modeling of protein-RNA binding sites. We advocate that the sensitivity and specificity of data be assessed in combination for computational quality control. Moreover, we demonstrate the value of analyzing sequence motif enrichment in peaks assigned from CLIP data and of visualizing RNA maps, which examine the positional distribution of peaks around regulated landmarks in transcripts. We use these to assess how variations in CLIP data quality and in different peak calling methods affect the insights into regulatory mechanisms. We conclude by discussing future opportunities for the computational analysis of protein-RNA interaction experiments.
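As an illustration of the RNA map idea mentioned in the abstract (the positional distribution of peaks around landmarks in transcripts), the following is a minimal sketch, not the authors' implementation. The function name `rna_map`, the `(chrom, strand, position)` tuple layout, and the default window size are assumptions made for this example only.

```python
from collections import Counter


def rna_map(peaks, landmarks, window=50):
    """Count peaks at each offset in [-window, +window] around landmarks.

    peaks, landmarks: iterables of (chrom, strand, position) tuples
    (a hypothetical layout chosen for this sketch).
    Returns a list of length 2 * window + 1; index 0 is offset -window.
    """
    # Group peak positions by chromosome and strand for quick lookup.
    by_key = {}
    for chrom, strand, pos in peaks:
        by_key.setdefault((chrom, strand), []).append(pos)

    counts = Counter()
    for chrom, strand, landmark in landmarks:
        for pos in by_key.get((chrom, strand), []):
            # Express the offset in transcript orientation:
            # on the minus strand, downstream means a smaller coordinate.
            offset = pos - landmark if strand == "+" else landmark - pos
            if -window <= offset <= window:
                counts[offset] += 1

    return [counts.get(o, 0) for o in range(-window, window + 1)]
```

Plotting the returned profile against offsets from `-window` to `+window` gives a simple positional summary; a real analysis would typically also normalize by the number of landmarks and compare regulated landmarks against a control set.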