College of Information Science and Engineering, Shaoyang University, Shaoyang 422000, China.
College of Information Science and Engineering, Hohai University, Nanjing 210000, China.
Math Biosci Eng. 2024 Feb 19;21(3):3798-3815. doi: 10.3934/mbe.2024169.
DNA N6-methyladenine (6mA) is an epigenetic modification that plays a pivotal role in biological processes including gene expression, DNA replication, repair, and recombination. Precise identification of 6mA sites is therefore fundamental to understanding its function, but remains challenging. We proposed SoftVoting6mA, an improved ensemble-based method for predicting 6mA sites in cross-species genomes. By comparing the performance of 15 types of encoding, SoftVoting6mA selected four (electron-ion-interaction pseudo potential, one-hot encoding, Kmer, and pseudo dinucleotide composition) to represent DNA sequences. Similarly, SoftVoting6mA combined four learning algorithms using a soft voting strategy. Five-fold cross-validation and independent tests showed that SoftVoting6mA reached state-of-the-art performance. To enhance accessibility, a user-friendly web server is provided at http://www.biolscience.cn/SoftVoting6mA/.
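As a rough illustration of two of the encodings and of the soft voting step named in the abstract, the following minimal Python sketch may help. It is not the authors' implementation: the function names, the all-zero handling of unknown bases, the equal model weights, and the 0.5 decision threshold are all assumptions made for this example, and the abstract does not say which four learning algorithms are combined.

```python
from itertools import product

BASES = "ACGT"


def one_hot(seq):
    """Flattened one-hot encoding: four values per nucleotide."""
    table = {b: [1 if i == j else 0 for j in range(4)] for i, b in enumerate(BASES)}
    vec = []
    for base in seq.upper():
        # Assumption: bases outside ACGT (e.g. N) become an all-zero column.
        vec.extend(table.get(base, [0, 0, 0, 0]))
    return vec


def kmer_frequencies(seq, k=2):
    """Normalized counts of every length-k word over ACGT, in lexicographic order."""
    seq = seq.upper()
    words = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(words, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing unknown bases
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in words]


def soft_vote(probabilities, weights=None):
    """Soft voting for a binary task.

    `probabilities` holds each model's predicted probability that the
    site is a 6mA site. The (optionally weighted) mean is returned with
    a label: 1 if the mean is at least 0.5, otherwise 0.
    """
    if weights is None:
        weights = [1.0] * len(probabilities)  # assumption: equal weights
    mean = sum(p * w for p, w in zip(probabilities, weights)) / sum(weights)
    return mean, int(mean >= 0.5)


# Hypothetical probabilities from four base learners for one sequence.
score, label = soft_vote([0.9, 0.4, 0.6, 0.3])
```

Unlike hard (majority) voting, which would tie at two votes each in this hypothetical case, soft voting lets the confident 0.9 prediction pull the averaged score above the threshold.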