School of Information Science and Engineering, Central South University, 932 South Lushan Rd, Changsha, 410083, China.
School of Computer and Information, Qiannan Normal University for Nationalities, Longshan Road, Duyun, 558000, China.
BMC Bioinformatics. 2018 Dec 31;19(Suppl 19):520. doi: 10.1186/s12859-018-2522-6.
Substantial evidence shows that circular RNAs (circRNAs) play important roles in controlling gene expression in human, mouse and nematode. More importantly, circRNAs are also involved in many diseases through fine-tuning of post-transcriptional gene expression by sequestering disease-associated miRNAs. Identifying circRNA-disease associations is therefore valuable for comprehensively understanding the mechanisms, treatment and diagnosis of diseases, yet it remains challenging. Because the mechanisms linking circRNAs and diseases are complex, discovering novel circRNA-disease associations through wet-lab experiments is expensive and time-consuming. Computational methods for discovering novel circRNA-disease associations are therefore urgently needed.
In this study, we develop a method (DWNN-RLS) to predict circRNA-disease associations based on regularized least squares with a Kronecker product kernel. The similarity of circRNAs is computed from the Gaussian interaction profile (GIP) kernel based on known circRNA-disease associations. The similarity of diseases is obtained as the mean of their GIP similarity and their semantic similarity, the latter computed from the directed acyclic graph (DAG) representation of diseases. The kernel over circRNA-disease pairs is constructed as the Kronecker product of the circRNA kernel and the disease kernel. The DWNN (decreasing weight k-nearest neighbor) method is adopted to calculate initial relational scores for new circRNAs and diseases. The Kronecker product kernel based regularized least squares approach is then used to predict new circRNA-disease associations. We adopt 5-fold cross validation (5CV), 10-fold cross validation (10CV) and leave-one-out cross validation (LOOCV) to assess the prediction performance of our method, and compare it with six competing methods (RLS-avg, RLS-Kron, NetLapRLS, KATZ, NBI and WP).
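The pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`gip_kernel`, `dwnn_fill`, `kron_rls`), the toy association matrix, and the parameter values (`gamma_prime`, `k`, `decay`, `sigma`) are all hypothetical, the exact DWNN weighting and normalization are assumed rather than taken from the paper, and the disease kernel here uses GIP similarity only (the DAG-based semantic similarity is omitted).

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    # Bandwidth normalized by the mean squared profile norm (assumed convention).
    gamma = gamma_prime / mean_sq if mean_sq > 0 else gamma_prime
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def dwnn_fill(Y, S, k=3, decay=0.7):
    """Give each all-zero row of Y an initial profile from its k most similar
    rows that have known associations, with weights decaying by rank."""
    Y = np.asarray(Y, dtype=float).copy()
    has_known = Y.sum(axis=1) > 0
    known = np.flatnonzero(has_known)
    if known.size == 0:
        return Y
    for i in np.flatnonzero(~has_known):
        nearest = known[np.argsort(-S[i, known])[:k]]
        # Assumed weighting: decay^(rank) times similarity, normalized by
        # the sum of the similarities to the selected neighbors.
        weights = decay ** np.arange(nearest.size) * S[i, nearest]
        norm = S[i, nearest].sum()
        if norm > 0:
            Y[i] = weights @ Y[nearest] / norm
    return Y

def kron_rls(Kc, Kd, Y, sigma=1.0):
    """Regularized least squares with the Kronecker product kernel, computed
    through the eigendecompositions of Kc and Kd so that the full
    pairwise kernel is never formed."""
    lc, Vc = np.linalg.eigh(Kc)
    ld, Vd = np.linalg.eigh(Kd)
    lam = np.outer(lc, ld)  # eigenvalues of the Kronecker product kernel
    Z = (lam / (lam + sigma)) * (Vc.T @ Y @ Vd)
    return Vc @ Z @ Vd.T

# Toy data (made up): 4 circRNAs x 3 diseases; the last circRNA is "new".
Y = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])
Kc = gip_kernel(Y)       # circRNA similarity from association profiles
Kd = gip_kernel(Y.T)     # disease similarity (GIP only in this sketch)
Y0 = dwnn_fill(Y, Kc)    # initial scores for the circRNA with no associations
F = kron_rls(Kc, Kd, Y0) # predicted association scores, same shape as Y
```

Working through the eigendecompositions keeps the cost at the size of the two small kernels instead of the full circRNA-disease pair kernel, which is why the sketch never calls `np.kron`. To fill in new diseases rather than new circRNAs, the same `dwnn_fill` can be applied to `Y.T` with the disease kernel.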
The experimental results show that DWNN-RLS reaches AUC values of 0.8854, 0.9205 and 0.9701 in 5CV, 10CV and LOOCV, respectively, indicating that DWNN-RLS outperforms the competing methods RLS-avg, RLS-Kron, NetLapRLS, KATZ, NBI and WP. In addition, case studies show that DWNN-RLS is an effective method for predicting new circRNA-disease associations.
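For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen known association is scored higher than a randomly chosen unknown pair. The sketch below computes it by direct pairwise comparison; the function name `auc_score` and the toy scores are hypothetical and are not the data behind the values reported above.

```python
import numpy as np

def auc_score(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative label")
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Made-up scores for held-out pairs: 1 = known association, 0 = unknown.
toy_auc = auc_score([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
```

In a k-fold or leave-one-out setting, the held-out known associations would be set to zero before training and then scored together with the unknown pairs.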