Brief Bioinform. 2021 Mar 22;22(2):2043-2057. doi: 10.1093/bib/bbaa028.
Accumulating evidence shows that microRNAs (miRNAs) play crucial roles in diverse biological processes, and their mutation and dysregulation have been shown to contribute to tumorigenesis. In silico identification of disease-associated miRNAs is a cost-effective strategy for discovering the most promising biomarkers for disease diagnosis and treatment. The growing number of available omics data sources provides unprecedented opportunities to decipher the underlying relationships between miRNAs and diseases with computational models. However, most existing methods are biased towards a single representation of miRNAs or diseases, and they cannot discover unobserved associations for new miRNAs or diseases that lack association information. In this study, we present a novel computational method, adaptive multi-source multi-view latent feature learning (M2LFL), to infer potential disease-associated miRNAs. First, we adopt multiple data sources to obtain similarity profiles and capture different latent features according to the geometric characteristics of the miRNA and disease spaces. Then, the multi-modal latent features are projected onto a common subspace to discover unobserved miRNA-disease associations in both the miRNA and disease views, and an adaptive joint graph regularization term is developed to preserve the intrinsic manifold structures of the multiple similarity profiles. Meanwhile, Lp,q-norms are imposed on the projection matrices to ensure sparsity and improve interpretability. The experimental results confirm the superior performance of the proposed method in screening reliable candidate disease miRNAs, which suggests that M2LFL could be an efficient tool for discovering diagnostic biomarkers to guide laborious clinical trials.
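The two regularizers named in the abstract can be illustrated with a minimal sketch. The abstract does not give their exact formulation, so the code assumes the commonly used definitions: an Lp,q-norm that takes the p-norm of each row of a matrix and then the q-norm of those row norms, and a graph regularizer of the form tr(FᵀLF) with Laplacian L = D − S built from a symmetric similarity matrix S. The function names and the fixed view weights are hypothetical; in M2LFL the weights are described as adaptive, which is not reproduced here.

```python
import numpy as np


def lpq_norm(W, p=2.0, q=1.0):
    """Assumed definition: (sum_i (sum_j |W_ij|^p)^(q/p))^(1/q).

    With p=2, q=1 this is the sum of row-wise Euclidean norms, which
    encourages entire rows of a projection matrix to become zero.
    """
    row_norms = np.sum(np.abs(W) ** p, axis=1) ** (1.0 / p)
    return float(np.sum(row_norms ** q) ** (1.0 / q))


def graph_regularizer(F, S):
    """tr(F^T L F) with L = D - S for a symmetric similarity matrix S.

    The value is small when rows of F that S marks as similar lie
    close together, so the similarity structure is preserved.
    """
    laplacian = np.diag(S.sum(axis=1)) - S
    return float(np.trace(F.T @ laplacian @ F))


def joint_graph_regularizer(F, similarities, weights):
    """Weighted sum of graph regularizers over several similarity views.

    The weights are fixed inputs here (an assumption for illustration).
    """
    return sum(w * graph_regularizer(F, S)
               for w, S in zip(weights, similarities))


if __name__ == "__main__":
    W = np.array([[3.0, 4.0], [0.0, 0.0]])
    S = np.array([[0.0, 1.0], [1.0, 0.0]])
    F = np.array([[1.0], [3.0]])
    print(lpq_norm(W))
    print(graph_regularizer(F, S))
    print(joint_graph_regularizer(F, [S, S], [0.5, 0.5]))
```

In this toy input, the second row of `W` is entirely zero and contributes nothing to the norm, which is the row-sparsity effect that makes such penalties useful for interpreting which latent features a projection actually uses.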