Wang Pan, Li Qi, Sun Nan, Gao Yibo, Liu Jun S, Deng Ke, He Jie
Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Center for Statistical Science & Department of Industry Engineering, Tsinghua University, Beijing, China.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa117.
Deciphering microRNA (miRNA) targets is important for understanding miRNA function and for developing miRNA-based diagnostics and therapeutics. Given the highly cell-specific nature of miRNA regulation, recent computational approaches typically exploit expression data to identify the most physiologically relevant target messenger RNAs (mRNAs). Although effective, those methods usually require a large sample size to infer miRNA-mRNA interactions, which limits their application in personalized medicine. In this study, we developed a novel miRNA target prediction algorithm called miRACLe (miRNA Analysis by a Contact modeL). It integrates sequence characteristics and RNA expression profiles into a random contact model, and determines target preferences from the relative probability of effective contacts in an individual-specific manner. Evaluation by a variety of measures shows that fitting TargetScan, a frequently used prediction tool, into the miRACLe framework improves its predictive power by a significant margin, and that the resulting model consistently outperforms other state-of-the-art methods in prediction accuracy, regulatory potential and biological relevance. Notably, the superiority of miRACLe is robust across biological contexts, types of expression data and validation datasets, and the computation is fast and efficient. Additionally, we show that the model can be readily applied to other sequence-based algorithms, such as DIANA-microT-CDS, miRanda-mirSVR and MirTarget4, to improve their predictive power. miRACLe is publicly available at https://github.com/PANWANG2014/miRACLe.
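As a rough illustration of how sequence scores and expression profiles might be combined into a relative contact probability for a single sample, consider the sketch below. It is not the published miRACLe formula: the function name `contact_scores`, the exponential weighting of the sequence score, and the convention that a score of 0 means "no predicted site" are all assumptions made for this example. The actual implementation is in the repository linked above.

```python
import numpy as np

def contact_scores(seq_score, mirna_expr, mrna_expr):
    """Relative probability of effective miRNA-mRNA contacts in one sample.

    seq_score  : (n_mirna, n_mrna) non-negative sequence-based scores,
                 with 0 meaning no predicted site (assumed convention).
    mirna_expr : (n_mirna,) miRNA abundances in the sample.
    mrna_expr  : (n_mrna,) mRNA abundances in the same sample.
    """
    seq_score = np.asarray(seq_score, dtype=float)
    mirna_expr = np.asarray(mirna_expr, dtype=float)
    mrna_expr = np.asarray(mrna_expr, dtype=float)

    # Random-contact term: how often molecule i meets molecule j,
    # taken here as the product of their abundances.
    encounters = np.outer(mirna_expr, mrna_expr)

    # Assumed affinity term: exp(score) where a site is predicted, else 0.
    affinity = np.where(seq_score > 0, np.exp(seq_score), 0.0)

    weight = encounters * affinity
    total = weight.sum()
    if total == 0:
        return np.zeros_like(weight)
    # Normalise so the scores are relative probabilities over all pairs.
    return weight / total

# Toy sample: 2 miRNAs x 2 mRNAs.
scores = contact_scores(
    seq_score=[[1.0, 0.0], [1.0, 1.0]],
    mirna_expr=[10.0, 1.0],
    mrna_expr=[5.0, 5.0],
)
ranking = np.argsort(scores, axis=None)[::-1]  # flat indices, strongest pair first
```

Because the score is computed from one sample's expression values, the ranking can differ between individuals even when the sequence-based predictions are identical, which is the sense of "individual-specific" that this sketch tries to convey.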