A convolution based computational approach towards DNA N6-methyladenine site identification and motif extraction in rice genome.

Affiliation

United International University, Dhaka, Bangladesh.

Publication information

Sci Rep. 2021 May 14;11(1):10357. doi: 10.1038/s41598-021-89850-9.

Abstract

N6-methylation of adenine nucleotides in DNA (6mA) is a post-replication modification responsible for many biological functions. Automated and accurate computational methods can help to identify 6mA sites in long genomes, saving significant time and money. Our study develops a convolutional neural network (CNN) based tool, i6mA-CNN, capable of identifying 6mA sites in the rice genome. Our model combines multiple types of features: a customized feature vector inspired by PseAAC (Pseudo Amino Acid Composition), multiple one-hot representations, and dinucleotide physicochemical properties. It achieves an auROC (area under the Receiver Operating Characteristic curve) score of 0.98 with an overall accuracy of 93.97% using fivefold cross-validation on the benchmark dataset. Finally, we evaluate our model on three other plant genome 6mA site identification test datasets. The results suggest that the proposed tool generalizes 6mA site identification across plant genomes, irrespective of plant species. An algorithm for potential motif extraction and a feature importance analysis procedure are two by-products of this research. The web tool for this research can be found at: https://cutt.ly/dgp3QTR .
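
To make the "one-hot representation" mentioned in the abstract concrete, the following is a minimal sketch of how a DNA window could be turned into a numeric matrix suitable as CNN input. It is not the i6mA-CNN implementation: the function name `one_hot_encode`, the A/C/G/T column order, the handling of unknown bases, and the example sequence are all assumptions made for illustration.

```python
import numpy as np

# Assumed column order; the abstract does not state which order i6mA-CNN uses.
NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence: str) -> np.ndarray:
    """Encode a DNA string as a (length, 4) matrix with one 1.0 per known base.

    Hypothetical helper for illustration only. Bases outside A/C/G/T
    (for example N) are left as all-zero rows.
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = np.zeros((len(sequence), len(NUCLEOTIDES)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        column = index.get(base)
        if column is not None:
            encoded[position, column] = 1.0
    return encoded


# Made-up example window, not taken from the benchmark dataset.
window = "GATTACANA"
matrix = one_hot_encode(window)
print(matrix.shape)
```

Each row of `matrix` corresponds to one position in the window, so a 1D convolution sliding along the rows can respond to short nucleotide patterns. The other feature types named in the abstract would need their own encodings.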
