SNN6mA: Improved DNA N6-methyladenine site prediction using Siamese network-based feature embedding.

Affiliations

Glasgow College, University of Electronic Science and Technology of China, Chengdu, 611731, China.

College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China.

Publication information

Comput Biol Med. 2023 Nov;166:107533. doi: 10.1016/j.compbiomed.2023.107533. Epub 2023 Sep 27.

Abstract

DNA N6-methyladenine (6mA) is one of the most common and abundant DNA modifications and plays essential roles in various biological processes and cellular functions. Accurate identification of DNA 6mA sites is therefore of great importance for better understanding its regulatory mechanisms and biological functions. Although significant progress has been made, there is still room for improvement in predicting 6mA sites in DNA sequences. In this study, we report a smart but accurate 6mA predictor, termed SNN6mA, based on a Siamese network. Specifically, DNA segments are first encoded into feature vectors using the one-hot encoding scheme; these original feature vectors are then mapped to a low-dimensional embedding space derived from the Siamese network to capture more discriminative features; finally, the resulting low-dimensional features are fed to a fully connected neural network to perform the final prediction. Stringent benchmarking tests on datasets from two species demonstrated that SNN6mA outperforms state-of-the-art 6mA predictors. Detailed data analyses show that the major advantage of SNN6mA lies in its use of the Siamese network, which maps the original features into a low-dimensional embedding space with greater discriminative capability. In summary, SNN6mA is the first attempt to use a Siamese network for 6mA site prediction and could be easily extended to predict other types of modifications. The code and datasets used in the study are freely available at https://github.com/YuXuan-Glasgow/SNN6mA for academic use.
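The first two steps of the pipeline described in the abstract (one-hot encoding, then a shared embedding compared pairwise) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state the segment length, the network architecture, or the training loss, so everything below is an assumption for illustration: the toy 9-base segments, the helper names `one_hot`, `make_embedder`, and `contrastive_loss`, the untrained random linear map standing in for a Siamese branch, and the contrastive loss, which is one commonly used Siamese objective and may differ from what SNN6mA uses.

```python
import math
import random

BASES = "ACGT"

def one_hot(segment):
    """Encode a DNA segment as a flat list of 0/1 values, four per base."""
    vec = []
    for base in segment.upper():
        # Assumption: symbols outside ACGT (e.g. N) become an all-zero group.
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def make_embedder(in_dim, out_dim, seed=0):
    """Return a fixed random linear map, a stand-in for one trained branch.

    A Siamese network applies the SAME weights to both inputs of a pair,
    which is why a single embedder is created and reused below.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def embed(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return embed

def contrastive_loss(e1, e2, same_label, margin=1.0):
    """Small for close same-class pairs and for well-separated mixed pairs."""
    d = math.dist(e1, e2)  # Euclidean distance between the two embeddings
    if same_label:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Two hypothetical segments that differ at one position.
seg_a = "ACGTAACGT"
seg_b = "ACGTTACGT"

embed = make_embedder(in_dim=4 * len(seg_a), out_dim=8)
emb_a = embed(one_hot(seg_a))
emb_b = embed(one_hot(seg_b))

loss_same = contrastive_loss(emb_a, emb_b, same_label=True)
loss_diff = contrastive_loss(emb_a, emb_b, same_label=False)
```

In a full model the embedder would be trained to minimize such a pairwise loss, and the resulting low-dimensional embeddings would then be passed to a separate fully connected classifier, as the abstract describes; that third step is omitted here.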
