Department of Computer Science, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic.
Department of Genomics, Institute of Hematology and Blood Transfusion, Prague, Czech Republic.
BMC Bioinformatics. 2022 Sep 27;23(1):392. doi: 10.1186/s12859-022-04957-8.
Recent research has shown that circular RNAs (circRNAs) are functional in gene expression regulation and potentially related to diseases. Due to their stability, circRNAs can also be used as biomarkers for diagnosis. However, the function of most circRNAs remains unknown, and determining it through biological experiments is expensive and time-consuming. In this paper, we predict circRNA annotations from knowledge of their interactions with miRNAs and the subsequent miRNA-mRNA interactions. First, we construct an interaction network for a target circRNA; second, we propagate information from the network nodes with known function to the root circRNA node. This idea itself is not new; our main contribution lies in proposing an efficient and exact deterministic procedure, based on the principle of probability-generating functions, to calculate the p-value of the association test between a circRNA and an annotation term. We show that our publicly available algorithm is both more effective and more efficient than the commonly used Monte-Carlo sampling approach, which may suffer from sampling convergence that is difficult to quantify and from the resulting sampling inefficiency. We experimentally demonstrate that the new approach is two orders of magnitude faster than Monte-Carlo sampling, which makes summary annotation of large circRNA files feasible; this includes, for example, their reannotation after periodic interaction network updates. We provide a summary annotation of a current circRNA database as one of our outputs. The proposed algorithm could be generalized to other types of RNA in a straightforward way.
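The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea of replacing sampling with a probability-generating function. It assumes a simplified null model that is not taken from the paper: each of n network nodes carries the annotation term independently with probability p_i, and the test statistic is the number of annotated nodes. The function names (`pgf_coefficients`, `exact_p_value`, `monte_carlo_p_value`) are hypothetical. Expanding the product of the per-node generating functions (1 - p_i + p_i z) gives the exact null distribution, and the p-value is the upper tail at the observed count.

```python
import random
from typing import List, Sequence


def pgf_coefficients(probs: Sequence[float]) -> List[float]:
    """Expand prod_i (1 - p_i + p_i * z); coeffs[k] equals P(X = k).

    Assumption (not from the paper): X is a sum of independent
    Bernoulli(p_i) indicators, one per network node.
    """
    coeffs = [1.0]  # PGF of the empty sum: P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * (1.0 - p)   # this node is not annotated
            nxt[k + 1] += c * p       # this node is annotated
        coeffs = nxt
    return coeffs


def exact_p_value(probs: Sequence[float], observed: int) -> float:
    """Exact upper-tail p-value P(X >= observed) under the assumed null."""
    coeffs = pgf_coefficients(probs)
    tail = sum(coeffs[max(observed, 0):])
    return min(1.0, tail)  # guard against floating-point overshoot


def monte_carlo_p_value(probs: Sequence[float], observed: int,
                        n_samples: int = 10_000, seed: int = 0) -> float:
    """Sampling estimate of the same tail probability, for comparison."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = sum(rng.random() < p for p in probs)
        if x >= observed:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    node_probs = [0.1, 0.2, 0.05, 0.3, 0.15]  # hypothetical values
    print(exact_p_value(node_probs, 3))
    print(monte_carlo_p_value(node_probs, 3))
```

Under these assumptions the exact computation costs O(n^2) arithmetic operations and has no sampling error, whereas the precision of the Monte-Carlo estimate depends on the number of samples, which is hard to fix in advance for very small p-values.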