MSNet-4mC: learning effective multi-scale representations for identifying DNA N4-methylcytosine sites.

Authors

Liu Chunting, Song Jiangning, Ogata Hiroyuki, Akutsu Tatsuya

Affiliations

Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Kyoto, Kyoto 606-8501, Japan.

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan.

Publication information

Bioinformatics. 2022 Nov 30;38(23):5160-5167. doi: 10.1093/bioinformatics/btac671.

DOI: 10.1093/bioinformatics/btac671
PMID: 36205602

Abstract

MOTIVATION

N4-methylcytosine (4mC) is an essential kind of epigenetic modification that regulates a wide range of biological processes. However, experimental methods for detecting 4mC sites are time-consuming and labor-intensive. As an alternative, computational methods that are capable of automatically identifying 4mC with data analysis techniques become a reasonable option. A major challenge is how to develop effective methods to fully exploit the complex interactions within the DNA sequences to improve the predictive capability.

RESULTS

In this work, we propose MSNet-4mC, a lightweight neural network building upon convolutional operations with multi-scale receptive fields to perceive cross-element relationships over both short and long ranges of given DNA sequences. With strong imbalances in the number of candidates in different species in mind, we compute and apply class weights in the cross-entropy loss to balance the training process. Extensive benchmarking experiments show that our method achieves a significant performance improvement and outperforms other state-of-the-art methods.
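The two ideas named in this paragraph, parallel convolutions with different receptive-field widths and class weights inside the cross-entropy loss, can be illustrated with a small sketch. It is not the authors' implementation (see the repository linked below for that). The function names, the kernel widths (3, 5, 7), the fixed averaging kernels standing in for learned weights, and the inverse-frequency weighting formula are all assumptions chosen for illustration; the abstract does not state how the class weights are computed.

```python
import math
from collections import Counter


def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; output length equals input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_scale(signal, kernel_sizes=(3, 5, 7)):
    """Apply parallel convolutions of different widths to the same input.

    Averaging kernels are placeholders for weights a network would learn.
    Returns one feature list per kernel size (short to long range).
    """
    return [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class).

    Hypothetical choice; rarer classes receive larger weights.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


# Toy data: 8 negative sites and 2 positive sites.
labels = [0] * 8 + [1] * 2
weights = class_weights(labels)  # {0: 0.625, 1: 2.5}

# A single marked position seen through three receptive-field widths.
features = multi_scale([0, 0, 0, 1, 0, 0, 0])
```

With these weights, each class contributes the same total weight to the loss (8 × 0.625 = 2 × 2.5 = 5), so the minority class is not swamped during training. In the multi-scale part, the widest kernel spreads the marked position over the whole toy sequence while the narrowest keeps it local, which is the intuition behind combining short-range and long-range views.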

AVAILABILITY AND IMPLEMENTATION

The source code and models are freely available for download at https://github.com/LIU-CT/MSNet-4mC, implemented in Python and supported on Linux and Windows.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1. MSNet-4mC: learning effective multi-scale representations for identifying DNA N4-methylcytosine sites.
   Bioinformatics. 2022 Nov 30;38(23):5160-5167. doi: 10.1093/bioinformatics/btac671.
2. DeepTorrent: a deep learning-based approach for predicting DNA N4-methylcytosine sites.
   Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa124.
3. 4mC-CGRU: Identification of N4-Methylcytosine (4mC) sites using convolution gated recurrent unit in Rosaceae genome.
   Comput Biol Chem. 2023 Dec;107:107974. doi: 10.1016/j.compbiolchem.2023.107974. Epub 2023 Oct 30.
4. Computational identification of N4-methylcytosine sites in the mouse genome with machine-learning method.
   Math Biosci Eng. 2021 Apr 15;18(4):3348-3363. doi: 10.3934/mbe.2021167.
5. DNA4mC-LIP: a linear integration method to identify N4-methylcytosine site in multiple species.
   Bioinformatics. 2020 Jun 1;36(11):3327-3335. doi: 10.1093/bioinformatics/btaa143.
6. Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning.
   Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa099.
7. MultiScale-CNN-4mCPred: a multi-scale CNN and adaptive embedding-based method for mouse genome DNA N4-methylcytosine prediction.
   BMC Bioinformatics. 2023 Jan 18;24(1):21. doi: 10.1186/s12859-023-05135-0.
8. Iterative feature representations improve N4-methylcytosine site prediction.
   Bioinformatics. 2019 Dec 1;35(23):4930-4937. doi: 10.1093/bioinformatics/btz408.
9. Exploring sequence-based features for the improved prediction of DNA N4-methylcytosine sites in multiple species.
   Bioinformatics. 2019 Apr 15;35(8):1326-1333. doi: 10.1093/bioinformatics/bty824.
10. Deep-4mCW2V: A sequence-based predictor to identify N4-methylcytosine sites in Escherichia coli.
    Methods. 2022 Jul;203:558-563. doi: 10.1016/j.ymeth.2021.07.011. Epub 2021 Aug 2.

Cited by

1. Time series-based hybrid ensemble learning model with multivariate multidimensional feature coding for DNA methylation prediction.
   BMC Genomics. 2023 Dec 11;24(1):758. doi: 10.1186/s12864-023-09866-5.
2. EMDL_m6Am: identifying N6,2'-O-dimethyladenosine sites based on stacking ensemble deep learning.
   BMC Bioinformatics. 2023 Oct 25;24(1):397. doi: 10.1186/s12859-023-05543-2.
3. Comparative evaluation and analysis of DNA N4-methylcytosine methylation sites using deep learning.
   Front Genet. 2023 Aug 21;14:1254827. doi: 10.3389/fgene.2023.1254827. eCollection 2023.
4. i4mC-GRU: Identifying DNA N4-Methylcytosine sites in mouse genomes using bidirectional gated recurrent unit and sequence-embedded features.
   Comput Struct Biotechnol J. 2023 May 16;21:3045-3053. doi: 10.1016/j.csbj.2023.05.014. eCollection 2023.