Iterative feature representations improve N4-methylcytosine site prediction.

Author affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, China.

Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fujian, China.

Publication information

Bioinformatics. 2019 Dec 1;35(23):4930-4937. doi: 10.1093/bioinformatics/btz408.

Abstract

MOTIVATION

Accurate genome-wide identification of N4-methylcytosine (4mC) modifications can provide insights into their biological functions and mechanisms. Machine learning methods have recently become effective approaches for the computational identification of 4mC sites in genomes. Unfortunately, existing methods cannot achieve satisfactory performance, owing to the lack of effective DNA feature representations capable of capturing the characteristics of 4mC modifications.

RESULTS

In this work, we developed a new predictor, 4mcPred-IFL, to identify 4mC sites. To represent and capture discriminative features, we proposed an iterative feature representation algorithm that learns informative features from several sequential models in a supervised, iterative manner. Our analysis showed that the feature representations learned by our algorithm capture the discriminative distribution characteristics between 4mC sites and non-4mC sites, enlarging the decision margin between positives and negatives in feature space. Additionally, by evaluating and comparing our predictor with state-of-the-art predictors on benchmark datasets, we demonstrate that it identifies 4mC sites more accurately.
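
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a supervised iterative feature representation loop could look like, not the 4mcPred-IFL implementation. Everything in it is an assumption made for illustration: k-mer frequencies as the base descriptors, a nearest-centroid scorer standing in for the trained models, out-of-fold scores as the learned features, a fixed number of rounds, and the names `kmer_freqs`, `centroid_score`, `out_of_fold_scores`, `iterative_features`, `toy_seqs` and `toy_labels`.

```python
from itertools import product
from math import dist


def kmer_freqs(seq, k):
    """Normalized k-mer frequency vector over the DNA alphabet (assumed descriptor)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    total = max(len(seq) - k + 1, 1)
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in counts:
            counts[km] += 1
    return [counts[km] / total for km in kmers]


def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def centroid_score(train_X, train_y, x):
    """Score in [0, 1]; higher means x lies closer to the positive-class centroid.

    A stand-in for a trained model. Both classes must be present in train_y.
    """
    pos = centroid([r for r, y in zip(train_X, train_y) if y == 1])
    neg = centroid([r for r, y in zip(train_X, train_y) if y == 0])
    d_pos, d_neg = dist(x, pos), dist(x, neg)
    return 0.5 if d_pos + d_neg == 0 else d_neg / (d_pos + d_neg)


def out_of_fold_scores(X, y, folds=2):
    """Score each sample with a model fitted on the other folds only."""
    scores = [0.0] * len(X)
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        train_X = [X[j] for j in train]
        train_y = [y[j] for j in train]
        for i in range(f, len(X), folds):
            scores[i] = centroid_score(train_X, train_y, X[i])
    return scores


def iterative_features(seqs, labels, ks=(1, 2), rounds=3, folds=2):
    """Each round, append the previous round's learned features to every base
    descriptor, rescore, and keep the new scores as the learned representation."""
    descriptors = [[kmer_freqs(s, k) for s in seqs] for k in ks]
    learned = [[] for _ in seqs]  # no learned features before the first round
    for _ in range(rounds):
        columns = []
        for desc in descriptors:
            X = [d + l for d, l in zip(desc, learned)]
            columns.append(out_of_fold_scores(X, labels, folds))
        learned = [list(row) for row in zip(*columns)]
    return learned


# Hypothetical toy data: label 1 marks a made-up "positive" sequence.
toy_seqs = ["CCGGCCGG", "GCCGGCCG", "ATATATAT", "TATATTAA",
            "CGCGCCGG", "GGCCGCGC", "AATTATAT", "TTAATATA"]
toy_labels = [1, 1, 0, 0, 1, 1, 0, 0]
features = iterative_features(toy_seqs, toy_labels)
print(features)
```

Each sequence ends up with one learned score per base descriptor. Using out-of-fold scores keeps a sample's own label out of the features computed for it, which is one common way to limit label leakage when model outputs are fed back in as inputs.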

AVAILABILITY AND IMPLEMENTATION

A user-friendly webserver implementing the proposed 4mcPred-IFL is freely accessible at http://server.malab.cn/4mcPred-IFL.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
