A Grid Search-Based Multilayer Dynamic Ensemble System to Identify DNA N4-Methylcytosine Using Deep Learning Approach.

Affiliations

Department of Computer Science and Engineering, Jagannath University, Dhaka 1100, Bangladesh.

School of Information Technology, Deakin University, Geelong 3125, Australia.

Publication information

Genes (Basel). 2023 Feb 25;14(3):582. doi: 10.3390/genes14030582.

Abstract

DNA (deoxyribonucleic acid) N4-methylcytosine (4mC) is an epigenetic modification of DNA that is important for modifying gene functions, such as protein interactions and DNA conformation and stability, and for controlling gene expression throughout cell development and genomic imprinting. It also plays a crucial role in the restriction-modification system. To further understand the function and regulation mechanism of 4mC, it is essential to locate 4mC sites precisely and to detect their chromosomal distribution. This research aims to design an efficient, high-throughput, discriminative intelligent computational system that predicts 4mC sites using the natural language processing method "word2vec" and a multi-configured 1D convolutional neural network (1D CNN). In this article, we propose a grid search-based multi-layer dynamic ensemble system (GS-MLDS) that can enhance the existing knowledge of each level. Each layer uses a grid search-based weight searching approach to find the optimal accuracy while minimizing computation time and the number of additional layers. We used eight publicly available benchmark datasets collected from different sources to test the proposed model's efficiency. The test accuracies obtained were 0.978, 0.954, 0.944, 0.961, 0.950, 0.973, 0.948, 0.952, 0.961, and 0.980. The proposed model was also compared with 16 distinct models, and the comparison indicates that it can accurately predict 4mC sites.
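
The abstract does not spell out how sequences are tokenized or how the per-layer weight search works, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that DNA sequences are split into overlapping k-mers before being embedded (a common way of preparing sequences for word2vec) and that a layer combines the predicted probabilities of several base classifiers with non-negative weights that sum to 1, chosen by exhaustive grid search on accuracy. The function names `kmer_tokens` and `grid_search_weights`, the step size, and the 0.5 decision threshold are all hypothetical.

```python
import itertools

import numpy as np


def kmer_tokens(sequence, k=3):
    """Split a DNA sequence into overlapping k-mers ("words" for an embedding model).

    Assumption: overlapping k-mers with stride 1; the abstract does not state this.
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def grid_search_weights(probs, labels, step=0.1):
    """Find ensemble weights on a grid that maximize accuracy.

    probs  : array of shape (n_models, n_samples), each row one model's
             predicted probability that a sample is a 4mC site.
    labels : array of shape (n_samples,), 1 for 4mC site, 0 otherwise.
    step   : grid resolution for each weight (hypothetical default).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n_models = probs.shape[0]
    n_steps = int(round(1.0 / step))

    best_weights, best_acc = None, -1.0
    # Enumerate integer tuples summing to n_steps, i.e. weights summing to 1.
    for combo in itertools.product(range(n_steps + 1), repeat=n_models):
        if sum(combo) != n_steps:
            continue
        weights = np.array(combo, dtype=float) / n_steps
        blended = weights @ probs                     # weighted average probability
        predictions = (blended >= 0.5).astype(int)    # assumed 0.5 threshold
        accuracy = float(np.mean(predictions == labels))
        if accuracy > best_acc:
            best_weights, best_acc = weights, accuracy
    return best_weights, best_acc


if __name__ == "__main__":
    # Toy probabilities from two hypothetical base models on four samples.
    toy_probs = [[0.9, 0.8, 0.4, 0.3],
                 [0.4, 0.6, 0.6, 0.2]]
    toy_labels = [1, 1, 0, 0]
    weights, accuracy = grid_search_weights(toy_probs, toy_labels)
    print(kmer_tokens("ACGTA", k=3), weights, accuracy)
```

The number of grid points grows quickly with the number of base models, so a coarse step keeps such a search cheap. In practice the weights would be selected on held-out validation data rather than on the test set.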

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/297d/10048346/a20ac97191d0/genes-14-00582-g001a.jpg
