MGF6mARice: prediction of DNA N6-methyladenine sites in rice by exploiting molecular graph feature and residual block.

Author information

School of Computer Science and Technology, Anhui University, Hefei, 230601, China.

School of Artificial Intelligence, Anhui University, Hefei, 230601, China.

Publication information

Brief Bioinform. 2022 May 13;23(3). doi: 10.1093/bib/bbac082.

Abstract

DNA N6-methyladenine (6mA) is produced by methylation at the N6 position of adenine, a modification that occurs at the molecular level, and it is involved in numerous vital biological processes in the rice genome. Given the shortcomings of biological experiments, researchers have developed many computational methods that predict 6mA sites with good performance. However, existing methods do not take the mechanism by which 6mA arises into account, and so do not extract features from the molecular structure. In this paper, we propose a novel deep learning method, named MGF6mARice, that combines a DNA molecular graph feature with a residual block structure to predict 6mA sites in rice. First, the DNA sequence is converted into the simplified molecular input line entry system (SMILES) format, which reflects the chemical molecular structure. Second, from these molecular structure data, we construct the DNA molecular graph feature based on the principle of the graph convolutional network. Then, a residual block is designed to extract higher-level, more distinguishable features from the molecular graph features. Finally, a prediction module outputs whether the site is a 6mA site. Under 10-fold cross-validation, MGF6mARice outperforms state-of-the-art approaches. Multiple experiments show that the molecular graph feature and the residual block improve the performance of MGF6mARice in 6mA prediction. To the best of our knowledge, this is the first time that a feature of a DNA sequence has been derived by considering the chemical molecular structure. We hope that MGF6mARice will help researchers analyze 6mA sites in rice.
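
The pipeline named in the abstract (molecular graph, graph convolution, residual block, prediction module) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the four-atom chain graph, the one-hot atom features, the layer sizes, the mean pooling over atoms, and the random, untrained weights are all assumptions made for illustration. The SMILES conversion step is omitted because it would require a cheminformatics toolkit.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization commonly used in GCNs."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                  # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

def gcn_layer(a_norm, h, w):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ h @ w, 0.0)

def residual_block(x, w1, w2):
    """Two dense layers with a skip connection: ReLU(x + ReLU(x W1) W2)."""
    hidden = np.maximum(x @ w1, 0.0)
    return np.maximum(x + hidden @ w2, 0.0)

def predict(adj, feats, params):
    """Graph -> GCN features -> residual block -> sigmoid score in (0, 1)."""
    a_norm = normalized_adjacency(adj)
    h = gcn_layer(a_norm, feats, params["w_gcn"])
    graph_vec = h.mean(axis=0)               # assumption: mean-pool over atoms
    z = residual_block(graph_vec, params["w1"], params["w2"])
    logit = z @ params["w_out"]
    return 1.0 / (1.0 + np.exp(-logit))

# Hypothetical toy graph: a 4-atom chain C-N-C-N (not a real nucleotide).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
# One-hot atom types: column 0 = carbon, column 1 = nitrogen.
feats = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

# Random, untrained weights; a real model would learn these from labeled sites.
rng = np.random.default_rng(0)
params = {
    "w_gcn": rng.normal(scale=0.1, size=(2, 8)),
    "w1": rng.normal(scale=0.1, size=(8, 8)),
    "w2": rng.normal(scale=0.1, size=(8, 8)),
    "w_out": rng.normal(scale=0.1, size=8),
}

score = predict(adj, feats, params)
print(f"untrained 6mA score: {score:.3f}")
```

The skip connection in `residual_block` adds the input back to the transformed features, which is the general idea behind residual blocks; the block described in the paper may differ in depth and layer type.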
