Annotating Diseases Using Human Phenotype Ontology Improves Prediction of Disease-Associated Long Non-coding RNAs.

Author affiliations

School of Computer Science and Engineering, Thuyloi University, 175 Tay Son, Dong Da, Hanoi, Vietnam; Vinmec Research Institute of Stem Cell and Gene Technology, 458 Minh Khai, Hai Ba Trung, Hanoi, Vietnam.

Vinmec Research Institute of Stem Cell and Gene Technology, 458 Minh Khai, Hai Ba Trung, Hanoi, Vietnam.

Publication information

J Mol Biol. 2018 Jul 20;430(15):2219-2230. doi: 10.1016/j.jmb.2018.05.006. Epub 2018 May 24.

Abstract

Recently, many long non-coding RNAs (lncRNAs) have been identified and their biological functions characterized; however, our understanding of the molecular mechanisms that link them to disease is still limited. Because identifying disease-lncRNA associations experimentally is difficult, computational methods have been proposed as a powerful tool to predict such associations. These methods are usually based on similarities between diseases or between lncRNAs, since similar diseases have been reported to be associated with functionally similar lncRNAs. Prediction performance therefore depends heavily on how well these similarities are captured. Previous studies calculated the similarity between two diseases by mapping each disease to exactly one Disease Ontology (DO) term and then applying a semantic similarity measure to the two terms. The problem with this approach is that a disease can be described by more than one DO term. To date, annotation databases of DO terms exist for genes but not for diseases. In contrast, the Human Phenotype Ontology (HPO) is designed to fully annotate human disease phenotypes. In this study, we therefore constructed disease similarity networks/matrices using HPO instead of DO. We then used these networks/matrices as inputs to two representative ranking algorithms, one machine learning-based (regularized least squares) and one network-based (heterogeneous graph-based inference). The results showed that both algorithms performed better on HPO-based networks/matrices than on DO-based ones. In addition, our method predicted 11 novel cancer-associated lncRNAs, which are supported by literature evidence.
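
The idea of scoring disease similarity from sets of phenotype annotations, rather than from a single ontology term per disease, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which semantic similarity measure the authors used, so plain Jaccard overlap stands in for it here, and the disease names and term labels (`T1` to `T5`) are hypothetical placeholders, not real HPO identifiers.

```python
def jaccard(terms_a, terms_b):
    """Jaccard overlap of two annotation sets (0.0 if both are empty).

    Stand-in for a semantic similarity measure; not the paper's measure.
    """
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def similarity_matrix(annotations):
    """Build a symmetric disease-by-disease similarity matrix.

    `annotations` maps a disease name to the set of phenotype terms
    annotated to it. Rows and columns follow the sorted disease names.
    """
    names = sorted(annotations)
    matrix = [
        [jaccard(annotations[p], annotations[q]) for q in names]
        for p in names
    ]
    return names, matrix


# Hypothetical toy annotations: each disease carries several terms.
annotations = {
    "disease_A": {"T1", "T2", "T3"},
    "disease_B": {"T2", "T3", "T4"},
    "disease_C": {"T5"},
}

names, sim = similarity_matrix(annotations)
for name, row in zip(names, sim):
    print(name, [round(value, 2) for value in row])
```

A matrix of this shape is the kind of input the abstract describes passing to the ranking algorithms. With a single term per disease, the overlap between `disease_A` and `disease_B` would collapse to all or nothing, which is the limitation the multi-term annotation is meant to avoid.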
