Oh Minsik, Rhee Sungmin, Moon Ji Hwan, Chae Heejoon, Lee Sunwon, Kang Jaewoo, Kim Sun
Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea.
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
PLoS One. 2017 Mar 31;12(3):e0174999. doi: 10.1371/journal.pone.0174999. eCollection 2017.
miRNAs are small non-coding RNAs that regulate gene expression by binding to the 3'-UTR of their target mRNAs. Many recent studies have reported that miRNAs play important biological roles by regulating specific mRNAs or genes, and many sequence-based algorithms have been developed to predict miRNA targets. However, these methods are not designed for condition-specific target prediction and produce many false positives; expression-based target prediction algorithms were therefore developed for condition-specific prediction. A typical strategy for using expression data is to exploit the negative regulatory effect of miRNAs on their target genes. To control false positives, a stringent cutoff is typically set, but the methods then tend to reject many true target relationships, producing false negatives. Overcoming these limitations requires additional information, and the literature is probably the best resource available. Recent literature mining systems compile millions of articles describing experiments designed for specific biological questions, and they provide functions for searching for specific information. To make use of the literature, we used a literature mining system, BEST, which automatically extracts information from the literature in PubMed and lets the user search it with any English words. By integrating omics data analysis methods and BEST, we developed Context-MMIA, a miRNA-mRNA target prediction method that combines expression data analysis results with literature information extracted according to a user-specified context. In a pathway enrichment analysis using the genes included in the top 200 miRNA-target pairs, Context-MMIA outperformed the four existing target prediction methods that we tested. In a second test, which asked whether prediction methods can reproduce experimentally validated target relationships, Context-MMIA again outperformed the four existing methods.
In summary, Context-MMIA allows the user to specify a context of the experimental data to predict miRNA targets, and we believe that Context-MMIA is very useful for predicting condition-specific miRNA targets.
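The trade-off described in the abstract, where a stringent anti-correlation cutoff removes false positives but also discards true targets, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the expression values, the names `miR-A` and `GENE1` to `GENE3`, the literature scores, and the weighted `combined_score` function are hypothetical and are not taken from Context-MMIA or BEST, whose actual scoring is not described in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def anticorrelated_pairs(mirna_expr, mrna_expr, candidates, cutoff):
    """Keep candidate (miRNA, gene) pairs whose correlation is <= cutoff."""
    kept = {}
    for mirna, gene in candidates:
        r = pearson(mirna_expr[mirna], mrna_expr[gene])
        if r <= cutoff:
            kept[(mirna, gene)] = r
    return kept


def combined_score(r, literature_score, weight=0.5):
    """Hypothetical blend of anti-correlation strength and a literature score.

    This weighting is an illustrative assumption, not the Context-MMIA formula.
    """
    return weight * (-r) + (1 - weight) * literature_score


# Toy expression profiles across five samples (made-up values).
mirna_expr = {"miR-A": [1, 2, 3, 4, 5]}
mrna_expr = {
    "GENE1": [5, 4, 3, 2, 1],  # strongly anti-correlated with miR-A
    "GENE2": [4, 5, 2, 3, 2],  # moderately anti-correlated
    "GENE3": [1, 2, 2, 4, 5],  # positively correlated
}
candidates = [("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-A", "GENE3")]

# A stringent cutoff keeps only the strongest pair; a lenient one keeps more.
strict = anticorrelated_pairs(mirna_expr, mrna_expr, candidates, cutoff=-0.9)
lenient = anticorrelated_pairs(mirna_expr, mrna_expr, candidates, cutoff=-0.5)

# Hypothetical context-specific literature scores in [0, 1].
literature = {"GENE1": 0.1, "GENE2": 0.9}

# Rank the leniently filtered pairs by the blended score, best first.
ranked = sorted(
    lenient,
    key=lambda pair: combined_score(lenient[pair], literature[pair[1]]),
    reverse=True,
)
print("strict :", sorted(strict))
print("lenient:", sorted(lenient))
print("ranked :", ranked)
```

In this toy setting the strict cutoff drops `GENE2`, while the lenient cutoff combined with the assumed literature score retains it and ranks it first. This is the kind of rescue of a rejected pair that motivates adding literature evidence.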