A Deep Neural Network for Identifying DNA N4-Methylcytosine Sites.

Authors

Zeng Feng, Fang Guanyun, Yao Lan

Affiliations

School of Computer Science and Engineering, Central South University, Changsha, China.

College of Mathematics and Econometrics, Hunan University, Changsha, China.

Publication information

Front Genet. 2020 Mar 6;11:209. doi: 10.3389/fgene.2020.00209. eCollection 2020.

Abstract

N4-methylcytosine (4mC) plays an important role in host defense and transcriptional regulation. Accurate identification of 4mC sites provides a more comprehensive understanding of its biological effects. At present, traditional machine learning algorithms are used for 4mC site prediction, but their complexity is relatively high, which makes them unsuitable for processing large datasets, and their prediction accuracy needs to be improved. Therefore, it is necessary to develop a new and effective method to accurately identify 4mC sites. In this work, we obtained a large number of 4mC sites and non-4mC sites of () from the latest MethSMRT website, which greatly expanded the dataset of , and developed a hybrid deep neural network framework named 4mcDeep-CBI to identify 4mC sites. To obtain high-dimensional information from the features, we input the preliminarily extracted features into a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory network (BLSTM) to generate advanced features. Taking the advanced features as input, we propose an integrated algorithm to improve feature representation. Experimental results on the large new dataset show that the proposed predictor generally achieves better performance in identifying 4mC sites than the state-of-the-art predictor. Notably, this is the first study to identify 4mC sites using a deep neural network. Moreover, our model runs much faster than the state-of-the-art predictor.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f41c/7067889/d740edd80073/fgene-11-00209-g0001.jpg
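
The abstract describes feeding preliminary sequence features into a CNN before further processing. As a rough illustration only, the sketch below shows how a single convolutional filter scans an encoded DNA window. The one-hot encoding, the function names `one_hot` and `conv1d_relu`, the example window, and the filter weights are all assumptions made for this example; they are not taken from 4mcDeep-CBI, and the BLSTM and the integrated algorithm are not shown.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of four 0/1 values in A, C, G, T order."""
    rows = []
    for base in seq.upper():
        # Symbols outside A/C/G/T (for example N) become an all-zero row.
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows


def conv1d_relu(rows, kernel, bias=0.0):
    """Slide one filter (width x 4 weights) along the sequence, then apply ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(rows) - width + 1):
        total = bias
        for k in range(width):
            for c in range(4):
                total += kernel[k][c] * rows[start + k][c]
        out.append(max(0.0, total))
    return out


# Hypothetical hand-set filter that fires only on the dinucleotide "CG".
# In a trained network these weights would be learned, not chosen by hand.
kernel = [
    [0.0, 1.0, 0.0, 0.0],  # first position: C
    [0.0, 0.0, 1.0, 0.0],  # second position: G
]

window = "ATCGACGT"  # made-up sequence window, not real data
features = conv1d_relu(one_hot(window), kernel, bias=-1.0)
print(features)  # [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

In a hybrid model of the kind the abstract names, many such filters would typically run in parallel and their outputs would be passed on to a recurrent layer; that wiring is left out here because the abstract does not specify it.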
