Zhang Yuntian, Yao Lantian, Chung Chia-Ru, Huang Yixian, Li Shangfu, Zhang Wenyang, Pang Yuxuan, Lee Tzong-Yi
Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China.
School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China.
iScience. 2024 Feb 28;27(4):109333. doi: 10.1016/j.isci.2024.109333. eCollection 2024 Apr 19.
Kinases are important enzymes that transfer phosphate groups from high-energy, phosphate-donating molecules to specific substrates and play essential roles in various cellular processes. Existing algorithms infer kinase activity from phosphoproteomics data, which is often costly to generate and requires valuable samples. Moreover, methods for extracting kinase activities from bulk RNA-sequencing data remain undeveloped. In this study, we propose KinPred-RNA, a computational framework for deriving kinase activities from bulk RNA-sequencing data in cancer samples. Using an extreme gradient boosting (XGBoost) regression model, KinPred-RNA outperforms random forest regression, multiple linear regression, and support vector machine regression models in predicting kinase activities from cancer-related RNA-sequencing data. Efficient gene signatures from the LINCS-L1000 dataset were used as inputs for KinPred-RNA. The results highlight the potential of KinPred-RNA's predictions to reflect biological function. In conclusion, KinPred-RNA constitutes a significant advance in cancer research by potentially facilitating the identification of cancer.
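The model comparison described in the abstract (a gradient-boosted regressor against random forest, multiple linear, and support vector regression) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes scikit-learn is available and uses its `GradientBoostingRegressor` as a stand-in for XGBoost. The data are synthetic: the "gene signature" matrix, the "kinase activity" target, and the helper names `make_synthetic_data` and `compare_regressors` are all hypothetical and exist only for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR


def make_synthetic_data(n_samples=120, n_genes=20, seed=0):
    """Build a toy expression matrix and a toy activity score (not real data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_genes))  # rows: samples, columns: genes
    # Assumed nonlinear relationship, chosen only so the models have signal to fit.
    y = X[:, 0] * X[:, 1] + np.sin(X[:, 2]) + 0.1 * rng.normal(size=n_samples)
    return X, y


def compare_regressors(X, y, cv=5):
    """Return the mean cross-validated R^2 for each of the four model families."""
    models = {
        # Stand-in for XGBoost; swap in xgboost.XGBRegressor if it is installed.
        "gradient_boosting": GradientBoostingRegressor(random_state=0),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
        "multiple_linear": LinearRegression(),
        "support_vector": SVR(),
    }
    scores = {}
    for name, model in models.items():
        r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")
        scores[name] = float(r2.mean())
    return scores


X, y = make_synthetic_data()
scores = compare_regressors(X, y)
for name, r2 in scores.items():
    print(f"{name}: mean R^2 = {r2:.3f}")
```

The scores on this toy data say nothing about the published results. The sketch only shows the shape of the comparison: the same features and target, one regressor per model family, and a shared cross-validation metric.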