Pian Cong, Yang Zhixin, Yang Yuqian, Zhang Liangyun, Chen Yuanyuan
College of Science, Nanjing Agricultural University, Nanjing, China.
Front Genet. 2021 Mar 19;12:650803. doi: 10.3389/fgene.2021.650803. eCollection 2021.
N6-methyladenosine (m6A), the most common posttranscriptional modification in eukaryotic mRNAs, plays an important role in mRNA splicing, editing, stability, and degradation. Because the methylation state is dynamic, methylation sequencing must be carried out over different time periods, which makes identifying RNA m6A sites difficult. It is therefore necessary to develop a fast and accurate method for identifying RNA N6-methyladenosine sites in the transcriptome. In this study, we use first-order and second-order Markov models to identify RNA N6-methyladenosine sites in three species, including human and mouse. Both models account for the correlation between adjacent nucleotides. The results show that our method performs better than other existing methods. Furthermore, codons, each consisting of three nucleotides, show usage bias in mRNA, and a second-order Markov model, which conditions each nucleotide on the two preceding ones, can capture this kind of information exactly. This may be the main reason why the second-order Markov model performs better than the first-order Markov model in the m6A prediction problem. In addition, we provide a corresponding web tool called MM-m6APred.
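The idea of scoring a candidate site with first-order and second-order Markov models can be illustrated with a minimal sketch. It is not the authors' implementation or the MM-m6APred code. It assumes a homogeneous (position-independent) Markov chain with add-one smoothing, trained separately on positive and negative sequences and compared by a log-likelihood ratio; the abstract does not specify these details. The function names `train_markov`, `log_likelihood`, and `score`, as well as the toy sequences, are hypothetical, and sequences are assumed to contain only A, C, G, and U.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGU"  # assumption: sequences contain only these four symbols


def train_markov(seqs, order, pseudocount=1.0):
    """Estimate log P(next nucleotide | previous `order` nucleotides).

    Counts are pooled over all positions (a homogeneous chain), and a
    pseudocount is added to every transition so that unseen transitions
    keep a nonzero probability.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1.0

    model = {}
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[ctx].values()) + pseudocount * len(ALPHABET)
        model[ctx] = {
            b: math.log((counts[ctx][b] + pseudocount) / total)
            for b in ALPHABET
        }
    return model


def log_likelihood(seq, model, order):
    """Sum of log transition probabilities along the sequence."""
    return sum(model[seq[i - order:i]][seq[i]] for i in range(order, len(seq)))


def score(seq, pos_model, neg_model, order):
    """Log-likelihood ratio; a positive value favors the positive class."""
    return (log_likelihood(seq, pos_model, order)
            - log_likelihood(seq, neg_model, order))


# Toy data for illustration only, not taken from the paper.
positives = ["UGGACUA", "AGGACUU", "CGGACUG"]
negatives = ["UUUACAA", "ACCAGUU", "CAUAGCG"]

for k in (1, 2):  # first-order and second-order models
    pos_model = train_markov(positives, k)
    neg_model = train_markov(negatives, k)
    print(k, score("AGGACUA", pos_model, neg_model, k))
```

With `order=2`, each nucleotide is conditioned on the two preceding ones, so every transition covers a trinucleotide. This is how a second-order model can pick up codon-level bias that a first-order model, which only sees dinucleotides, cannot.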