Alsenan Shrooq, Al-Turaiki Isra, Aldayel Mashael, Tounsi Mohamed
Information Systems Department, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia.
Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11653, Saudi Arabia.
Curr Issues Mol Biol. 2024 Feb 4;46(2):1360-1373. doi: 10.3390/cimb46020087.
RNA-binding proteins (RBPs) play an important role in biological processes such as gene regulation. Understanding their behavior, for example their binding sites, can help in understanding RBP-related diseases. Previous studies have focused on predicting RNA binding with machine learning algorithms, including deep convolutional neural network (CNN) models. An integral part of building a deep learning model is tuning its hyperparameters and minimizing a loss function with an optimization algorithm. In this paper, we investigate the role of optimization in the RBP classification problem using the CLIP-Seq 21 dataset. Three optimization methods are applied to the RNA-protein binding CNN prediction model: grid search, random search, and Bayesian optimization. The empirical results show AUCs of 94.42%, 93.78%, 93.23%, and 92.68% on the ELAVL1C, ELAVL1B, ELAVL1A, and HNRNPC datasets, respectively, and a mean AUC of 85.30% on 24 datasets. These findings provide evidence of the role of optimizers in improving the performance of RNA-protein binding prediction.
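As a rough illustration of two of the three tuning strategies named in the abstract, the sketch below contrasts grid search with random search. It is not the paper's implementation: the objective `toy_validation_score`, the hyperparameter names `learning_rate` and `dropout`, and their ranges are hypothetical stand-ins for training the CNN and measuring its validation AUC. Bayesian optimization is left out because it additionally requires a surrogate model of the objective.

```python
import itertools
import random


def toy_validation_score(learning_rate, dropout):
    """Hypothetical stand-in for a validation AUC (assumption, not from the paper).

    A smooth function that peaks at learning_rate=0.01 and dropout=0.3.
    """
    return 1.0 - 100.0 * (learning_rate - 0.01) ** 2 - (dropout - 0.3) ** 2


def grid_search(score_fn, grid):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(score_fn, bounds, n_trials, seed=0):
    """Sample n_trials points uniformly within the bounds and keep the best."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    # Hypothetical search space, chosen only for illustration.
    grid = {"learning_rate": [0.001, 0.01, 0.1], "dropout": [0.1, 0.3, 0.5]}
    bounds = {"learning_rate": (0.001, 0.1), "dropout": (0.0, 0.5)}
    print("grid search:  ", grid_search(toy_validation_score, grid))
    print("random search:", random_search(toy_validation_score, bounds, n_trials=20))
```

Grid search costs one evaluation per combination, so its budget grows multiplicatively with each added hyperparameter, whereas random search uses a fixed number of trials regardless of how many hyperparameters are tuned. In a real setting each call to the objective would be a full training run.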