Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
Department of Biostatistics, Harvard University, Cambridge, MA, USA.
Methods Mol Biol. 2023;2586:197-215. doi: 10.1007/978-1-0716-2768-6_12.
Deep neural networks have demonstrated improved performance at predicting sequence specificities of DNA- and RNA-binding proteins. However, it remains unclear why they perform better than previous methods that rely on k-mers and position weight matrices. Here, we highlight a recent deep learning-based software package, called ResidualBind, that analyzes RNA-protein interactions using only RNA sequence as an input feature and performs global importance analysis for model interpretability. We discuss practical considerations for model interpretability to uncover learned sequence motifs and their secondary structure preferences.
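As an illustration of the global importance analysis mentioned above, the following minimal sketch embeds a candidate pattern at a fixed position in randomly sampled background sequences and reports the average change in a model's prediction. It is not the ResidualBind API. Every name here (`one_hot`, `random_backgrounds`, `embed`, `global_importance`, `toy_predict`) is hypothetical, the uniform background distribution is an assumption, and `toy_predict` is a stand-in for a trained network that simply counts occurrences of an arbitrary example pattern.

```python
import numpy as np

ALPHABET = "ACGU"


def one_hot(seqs):
    """Encode equal-length RNA strings as an (N, L, 4) one-hot array."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    out = np.zeros((len(seqs), len(seqs[0]), len(ALPHABET)), dtype=np.float32)
    for n, seq in enumerate(seqs):
        for pos, base in enumerate(seq):
            out[n, pos, index[base]] = 1.0
    return out


def random_backgrounds(num, length, rng):
    """Sample sequences from a uniform nucleotide distribution (an assumption)."""
    idx = rng.integers(0, len(ALPHABET), size=(num, length))
    return ["".join(ALPHABET[i] for i in row) for row in idx]


def embed(seqs, pattern, position):
    """Overwrite each sequence with `pattern`, starting at `position`."""
    end = position + len(pattern)
    return [s[:position] + pattern + s[end:] for s in seqs]


def global_importance(predict, pattern, position, num=1000, length=41, seed=0):
    """Mean prediction with the pattern embedded minus mean without it."""
    rng = np.random.default_rng(seed)
    background = random_backgrounds(num, length, rng)
    with_pattern = embed(background, pattern, position)
    diff = predict(one_hot(with_pattern)) - predict(one_hot(background))
    return float(np.mean(diff))


def toy_predict(x, motif_seq="UGCAUG"):
    """Hypothetical stand-in for a trained model: counts exact motif matches."""
    motif = one_hot([motif_seq])[0]          # shape (k, 4)
    k = motif.shape[0]
    scores = np.zeros(x.shape[0])
    for start in range(x.shape[1] - k + 1):
        window = x[:, start:start + k, :]
        # A window matches when all k positions agree with the motif.
        scores += np.sum(window * motif, axis=(1, 2)) == k
    return scores


if __name__ == "__main__":
    print(global_importance(toy_predict, "UGCAUG", position=17))
    print(global_importance(toy_predict, "AAAAAA", position=17))
```

With a real model, `predict` would wrap the network's forward pass. Because the backgrounds are sampled rather than taken from individual test sequences, the resulting score reflects the pattern's average effect across contexts rather than its effect in any single sequence.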