Wuhan University of Science and Technology, Wuhan, Hubei, China.
China Three Gorges University, Yichang, Hubei, China.
PeerJ. 2023 Sep 25;11:e16125. doi: 10.7717/peerj.16125. eCollection 2023.
DNA methylation is a crucial topic in bioinformatics research. Traditional wet-lab experiments are usually time-consuming and expensive, whereas machine learning offers an efficient alternative. In this study, we propose DeepMethylation, a novel deep learning predictor of methylation sites. First, the DNA sequence is encoded with word embedding and GloVe. Next, dilated convolution and a Transformer encoder extract features from the encoded sequence. Finally, a fully connected layer and a softmax operator predict the methylation sites. The proposed model achieves an accuracy of 97.8% on the 5mC dataset, outperforming state-of-the-art methods. Furthermore, the predictor generalizes well, achieving an accuracy of 95.8% on the m1A dataset. To ease access for other researchers, our code is publicly available at https://github.com/sb111169/tf-5mc.
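The encoding, dilated convolution, and softmax steps named above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation (see the repository linked above for that). The overlapping 3-mer tokenization, the function names, and the toy kernel are all assumptions made for illustration, and the GloVe embedding and Transformer encoder are omitted entirely.

```python
import math
from itertools import product


def kmer_tokens(seq, k=3):
    """Map a DNA sequence to integer ids of its overlapping k-mers.

    Treating k-mers as 'words' is an assumption; the abstract only says
    that word embedding and GloVe are used for encoding.
    """
    vocab = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    return [vocab[seq[i:i + k]] for i in range(len(seq) - k + 1)]


def dilated_conv1d(x, kernel, dilation=1):
    """Valid 1D cross-correlation whose kernel taps are `dilation` steps apart.

    A larger dilation widens the receptive field without adding weights.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * x[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(x) - span)
    ]


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy inputs, not data from the paper.
tokens = kmer_tokens("ACGTACGTAC", k=3)
features = dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 0, -1], dilation=2)
probs = softmax([0.0, 0.0])
```

In a full model, the token ids would index a learned or GloVe-initialized embedding table, and the convolution would run over embedding vectors rather than scalars.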