Bachelor of Engineering in Software Engineering from Sichuan University.
Chengdu University of Traditional Chinese Medicine.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa278.
N7-methylguanosine (m7G) is an important epigenetic modification that plays an essential role in regulating gene expression. Accurate identification of m7G modifications will therefore help reveal their potential functional mechanisms and deepen our understanding of them. Although high-throughput experimental methods can locate m7G sites precisely, they remain costly. It is therefore necessary to develop new methods to identify m7G sites.
In this work, we used the iterative feature representation algorithm to develop a machine-learning-based method, m7G-IFL, for identifying m7G sites. To assess its performance, we evaluated m7G-IFL and compared it with existing predictors. The results show that m7G-IFL identifies m7G sites more accurately than existing predictors. By analyzing and comparing the features used by the predictors, we found that positive and negative samples are more clearly separated in our feature space than in the feature spaces of existing predictors. This suggests that the iterative feature learning process extracts more discriminative information, which in turn contributes to the improved predictive performance.
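The general idea of iterative feature representation can be illustrated with a minimal sketch. The abstract does not specify the sequence descriptors, the base learner, or the number of rounds, so everything below is an assumption chosen for illustration: the descriptors `one_hot` and `kmer_freq`, the toy centroid-based scorer `fit_scorer`, and the function `iterative_features` are hypothetical names and are not taken from m7G-IFL itself.

```python
from itertools import product

ALPHABET = "ACGU"

def kmer_freq(seq, k):
    """Hypothetical descriptor: normalized k-mer frequencies over ACGU."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = {km: 0 for km in kmers}
    windows = max(len(seq) - k + 1, 1)
    for i in range(len(seq) - k + 1):
        sub = seq[i:i + k]
        if sub in counts:
            counts[sub] += 1
    return [counts[km] / windows for km in kmers]

def one_hot(seq):
    """Hypothetical descriptor: binary encoding (equal-length sequences assumed)."""
    return [1.0 if base == a else 0.0 for base in seq for a in ALPHABET]

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_scorer(X, y):
    """Toy base learner standing in for a real classifier.

    Returns a function mapping a vector to a score in [0, 1]; higher means
    closer to the positive-class centroid. Assumes both classes are present.
    """
    pos = centroid([x for x, label in zip(X, y) if label == 1])
    neg = centroid([x for x, label in zip(X, y) if label == 0])

    def score(x):
        dp, dn = sq_dist(x, pos), sq_dist(x, neg)
        return 0.5 if dp + dn == 0 else dn / (dp + dn)

    return score

def iterative_features(seqs, labels, descriptors, rounds=3):
    # Round 0: each descriptor view contributes one class score per sequence.
    columns = []
    for describe in descriptors:
        view = [describe(s) for s in seqs]
        score = fit_scorer(view, labels)
        columns.append([score(x) for x in view])
    feats = [list(row) for row in zip(*columns)]
    # Later rounds: learn a score from the current vector and append it.
    # Scores here are computed on the training data itself to keep the sketch
    # short; a real pipeline would use out-of-fold predictions instead.
    for _ in range(rounds):
        score = fit_scorer(feats, labels)
        feats = [row + [score(row)] for row in feats]
    return feats

# Made-up toy sequences, for illustration only (not real m7G data).
seqs = ["AGGCU", "AGGAU", "CUUCA", "CUUGA"]
labels = [1, 1, 0, 0]
feats = iterative_features(seqs, labels, [one_hot, lambda s: kmer_freq(s, 2)], rounds=2)
for seq, row in zip(seqs, feats):
    print(seq, [round(v, 3) for v in row])
```

Each sequence ends up with a short vector of learned class scores rather than its raw descriptors, which is one plausible way a feature space could come to separate positive and negative samples more clearly.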