Pasquier Claude, Gardès Julien
University of Nice Sophia Antipolis, I3S, UMR 7271, 06900 Sophia Antipolis, France.
CNRS, I3S, UMR 7271, 06900 Sophia Antipolis, France.
Sci Rep. 2016 Jun 1;6:27036. doi: 10.1038/srep27036.
MicroRNAs (miRNAs) play critical roles in many physiological processes. Their dysregulation is also closely associated with the development and progression of various human diseases, including cancer. Identifying new miRNAs that are associated with diseases therefore contributes to a better understanding of pathogenic mechanisms. MiRNAs also represent a tremendous opportunity in biotechnology for early diagnosis. To date, several in silico methods have been developed to predict miRNA-disease associations. However, these methods have various limitations. In this study, we investigate the hypothesis that information attached to miRNAs and diseases can be revealed by distributional semantics. Our basic approach is to represent distributional information on miRNAs and diseases in a high-dimensional vector space and to define associations between miRNAs and diseases in terms of their vector similarity. Cross-validations performed on a dataset of known miRNA-disease associations demonstrate the excellent performance of our method. Moreover, a case study on breast cancer confirms the ability of our method to discover new disease-miRNA associations and to identify putative false associations reported in databases.
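The idea of scoring miRNA-disease associations by vector similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the function names (`cosine_similarity`, `rank_mirnas`), the miRNA labels, and the three-dimensional toy vectors are all hypothetical placeholders rather than data from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_mirnas(disease_vector, mirna_vectors):
    """Return (miRNA name, similarity) pairs sorted from most to least similar."""
    scores = [
        (name, cosine_similarity(disease_vector, vec))
        for name, vec in mirna_vectors.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors invented for illustration only; real vectors would be
# high-dimensional and derived from distributional information.
mirna_vectors = {
    "miRNA-A": [1.0, 0.0, 0.0],
    "miRNA-B": [0.0, 1.0, 0.0],
    "miRNA-C": [1.0, 1.0, 0.0],
}
disease_vector = [1.0, 0.0, 0.0]

ranking = rank_mirnas(disease_vector, mirna_vectors)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch, the miRNAs ranked highest for a disease would be treated as candidate associations, while a known association with a very low score could be flagged for review as a possible false entry.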