Wang Yan, Guo Rui, Huang Lan, Yang Sen, Hu Xuemei, He Kai
Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China.
School of Artificial Intelligence, Jilin University, Changchun, China.
Front Genet. 2021 May 27;12:670852. doi: 10.3389/fgene.2021.670852. eCollection 2021.
N6-methyladenosine (m6A) is one of the most prevalent RNA post-transcriptional modifications and is involved in various vital biological processes, such as mRNA splicing, export, and stability. Identifying m6A sites contributes to understanding the functional mechanism and biological significance of m6A. The existing experimental methods for identifying m6A sites are time-consuming and costly. Thus, developing a high-confidence computational method is important for exploring the intrinsic characteristics of m6A. In this study, we propose a predictor called m6AGE, which utilizes sequence-derived and graph embedding features. To the best of our knowledge, our predictor is the first to combine sequence-derived features and graph embeddings for m6A site prediction. Comparison results show that our proposed predictor achieved the best performance among the compared predictors on four public datasets across three species. On the dataset, our predictor outperformed the compared predictors by 1.34% (accuracy), 0.0227 (Matthews correlation coefficient), 5.63% (specificity), and 0.0081 (AUC), which indicates that m6AGE is a useful tool for m6A site prediction. The source code of m6AGE is available at https://github.com/bokunoBike/m6AGE.
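For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, specificity, and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the m6AGE code or results; this is the standard textbook definition, not necessarily the authors' evaluation script.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not part of the m6AGE repository.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Specificity: fraction of true negatives (non-m6A sites) correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Sensitivity: fraction of true positives (m6A sites) correctly detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # MCC ranges from -1 to 1; by convention it is set to 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "mcc": mcc,
    }


# Made-up counts, purely to show the call.
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

Accuracy and specificity are proportions, which is why the abstract reports their gains in percent, while MCC and AUC are reported as absolute differences on their own scales.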