College of Mathematics and Statistics, Shenzhen University, 518000, Guangdong, China.
Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbaa440.
The developmental process of epithelial-mesenchymal transition (EMT) is abnormally activated during breast cancer metastasis. The transcriptional regulatory networks that control EMT have been well studied; alternative splicing (AS) of RNA also plays a vital regulatory role in this process, but its regulatory mechanism needs further exploration. Because biological experiments are costly and complex, the mechanisms by which AS and its associated RNA-binding proteins (RBPs) regulate EMT remain largely unknown. There is therefore an urgent need for computational methods that predict potential RBP-AS event associations during EMT.
We developed RAIMC, a novel model for RBP-AS target prediction during EMT that is based on inductive matrix completion. Integrated RBP similarities were calculated from RBP regulating similarity and RBP Gaussian interaction profile (GIP) kernel similarity, while integrated AS event similarities were computed from AS event module similarity and AS event GIP kernel similarity. Our primary objective was to complete missing or unknown RBP-AS event associations using the known associations together with the integrated RBP and AS event similarities. In this paper, we identify significant RBPs for AS events during EMT and discuss potential regulatory mechanisms. Our computational results confirm the effectiveness of our model and its superiority over other state-of-the-art methods. RAIMC achieved AUC values of 0.9587 and 0.9765 under leave-one-out cross-validation (CV) and 5-fold CV, respectively, which are higher than the AUC values of the previous models. RAIMC is a general matrix completion framework that can be adopted to predict associations between other biological entities. We further validated the prediction performance of RAIMC on the genes CD44 and MAP3K7, for which RAIMC can identify the regulating RBPs related to the isoforms of these two genes.
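The GIP kernel similarity mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition, in which each entity's interaction profile is its row (or column) of the binary association matrix and the kernel bandwidth is normalized by the mean squared profile norm. The function name `gip_kernel`, the `bandwidth` parameter, and the toy matrix `A` are hypothetical and are not taken from the RAIMC source code; the exact formulation used by RAIMC may differ.

```python
import numpy as np

def gip_kernel(profiles, bandwidth=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Assumed definition: K[i, j] = exp(-gamma * ||p_i - p_j||^2), with
    gamma = bandwidth / mean_i(||p_i||^2).
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    n = profiles.shape[0]
    if mean_sq == 0:
        # All profiles are zero, so every pairwise distance is zero.
        return np.ones((n, n))
    gamma = bandwidth / mean_sq
    # Pairwise squared Euclidean distances via the expansion of ||a - b||^2.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)

# Toy association matrix (hypothetical): rows are RBPs, columns are AS events.
A = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
])

K_rbp = gip_kernel(A)    # RBP-by-RBP similarity from row profiles
K_as = gip_kernel(A.T)   # AS-event-by-AS-event similarity from column profiles
```

In this toy matrix the first two RBPs share an identical profile, so their similarity is exactly 1, while the third RBP shares no AS events with them and receives a lower similarity. How such a kernel is combined with the regulating or module similarity to form the integrated similarity is not specified here and is left out of the sketch.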
The source code for RAIMC is available at https://github.com/yushanqiu/RAIMC.
Contact: zouquan@nclab.net