Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea.
School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea.
Bioinformatics. 2022 Aug 10;38(16):3885-3891. doi: 10.1093/bioinformatics/btac434.
Recent research has shown that DNA N6-methyladenine (6mA) plays an essential role in epigenetic modification in eukaryotic species, and 6mA has been linked to various biological processes. Investigating these biological roles requires an algorithm that can rapidly and reliably detect 6mA sites in genomes. Identifying 6mA marks in the genome is the first and most important step toward understanding the underlying molecular processes and their regulatory functions.
In this article, we propose i6mA-Caps, a novel CapsuleNet-based computational framework for identifying DNA N6-methyladenine sites. The framework uses a single encoding scheme to represent the DNA sequence numerically. A set of convolution layers extracts low-level features from this numerical data. A capsule network then derives intermediate-level and, subsequently, high-level features from them to classify the 6mA sites. The network is evaluated on three datasets from three genomes: Rosaceae, rice and Arabidopsis thaliana. The proposed method attains accuracies of 96.71%, 94% and 86.83% on the independent Rosaceae, rice and A. thaliana datasets, respectively. The framework shows improved results compared with existing state-of-the-art methods.
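The first and last stages of this pipeline can be illustrated with a minimal sketch. The abstract does not say which encoding scheme is used, so one-hot encoding is assumed here purely for illustration. The `squash` function is the standard capsule nonlinearity from the capsule-network literature, not necessarily the exact form used in i6mA-Caps. The names `one_hot_encode` and `squash` are hypothetical helpers, and the convolution layers and routing between capsules are omitted.

```python
import numpy as np

BASES = "ACGT"  # assumed alphabet; the actual encoding used by i6mA-Caps is not stated


def one_hot_encode(seq):
    """Map a DNA string to an (L, 4) matrix; unknown bases (e.g. N) become all-zero rows."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = np.zeros((len(seq), len(BASES)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded


def squash(vectors, axis=-1, eps=1e-8):
    """Capsule squash: keep each vector's direction, map its length into [0, 1)."""
    sq_norm = np.sum(vectors ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * vectors


# Encode a short toy sequence (real inputs would be fixed-length windows around an adenine).
encoded = one_hot_encode("ACGTN")

# Squash two toy capsule vectors; the output length acts as a presence probability.
capsules = squash(np.array([[3.0, 4.0], [0.0, 0.0]]))
```

In a capsule classifier, the length of each output capsule after squashing is typically read as the probability that the corresponding class (here, 6mA site or non-site) is present.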
A user-friendly web server is available for biological experts at: http://nsclbio.jbnu.ac.kr/tools/i6mA-Caps/.
Supplementary data are available at Bioinformatics online.