IEEE/ACM Trans Comput Biol Bioinform. 2023 May-Jun;20(3):1774-1782. doi: 10.1109/TCBB.2022.3215194. Epub 2023 Jun 5.
With the development of bioinformatics, the important role of lncRNAs in various intractable diseases has attracted wide interest. Recent studies have found that several human diseases are related to lncRNAs. However, exploring unknown lncRNA-disease associations (LDAs) experimentally is difficult and expensive, so only a few associations have been confirmed. A more accurate and effective method for identifying potential LDAs is therefore needed. In this study, a collaborative matrix factorization method based on correntropy (LDCMFC) is proposed for identifying potential LDAs. To improve the robustness of the algorithm, the traditional minimization of the Euclidean distance is replaced with the maximization of correntropy. In addition, the weighted K nearest known neighbor (WKNKN) method is used to rebuild the adjacency matrix. The performance of LDCMFC is evaluated by 5-fold cross-validation. Compared with other traditional methods, LDCMFC obtains a higher AUC of 0.8628. In case studies of three important cancers, most of the predicted lncRNAs have been validated in the databases. The results show that LDCMFC is a feasible method for predicting LDAs.
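The robustness argument for replacing the Euclidean distance with correntropy can be illustrated with a small sketch. It is not the LDCMFC implementation: it assumes the common empirical correntropy estimate with a Gaussian kernel, and the function names, the kernel width `sigma`, and the toy vectors are all hypothetical. It only shows that a single large residual dominates the squared Euclidean loss, while its effect on correntropy is bounded.

```python
import math


def squared_euclidean(observed, predicted):
    """Sum of squared residuals, the quantity a Euclidean-distance loss minimizes."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def correntropy(observed, predicted, sigma=1.0):
    """Empirical correntropy with a Gaussian kernel (assumed form).

    Each residual e contributes exp(-e^2 / (2 * sigma^2)), a value in (0, 1],
    so one entry can lower the mean by at most 1/n.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    kernel_values = [math.exp(-(e * e) / (2.0 * sigma ** 2)) for e in residuals]
    return sum(kernel_values) / len(kernel_values)


# Hypothetical toy data: four entries of an association matrix and two fits.
observed = [1.0, 0.0, 1.0, 0.0]
clean_fit = [0.9, 0.1, 0.9, 0.1]    # small residuals everywhere
outlier_fit = [0.9, 0.1, 0.9, 5.0]  # same fit, but one grossly wrong entry

for name, fit in (("clean", clean_fit), ("outlier", outlier_fit)):
    print(name, squared_euclidean(observed, fit), correntropy(observed, fit))
```

In this toy case the outlier raises the squared Euclidean loss by several orders of magnitude, whereas the correntropy drops by no more than one quarter, because each of the four entries contributes at most 1/4 to the mean. A fit obtained by maximizing correntropy is therefore less driven by a few noisy entries.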