Uthayopas Korawich, de Sá Alex G C, Alavi Azadeh, Pires Douglas E V, Ascher David B
Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia.
Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia.
Mol Ther Nucleic Acids. 2021 Aug 26;26:536-546. doi: 10.1016/j.omtn.2021.08.016. eCollection 2021 Dec 3.
The emergence of high-throughput sequencing techniques has revealed a primary role of microRNAs (miRNAs) in a wide range of diseases, including cancers and neurodegenerative disorders. Understanding novel relationships between miRNAs and diseases can potentially unveil complex pathogenesis mechanisms, leading to effective diagnosis and treatment. The investigation of novel miRNA-disease associations, however, is currently costly and time-consuming. Over the years, several computational models have been proposed to prioritize potential miRNA-disease associations, but they offer limited usability or predictive capability. To fill this gap, we introduce TSMDA, a novel machine-learning method that leverages target and symptom information, together with negative sample selection, to predict miRNA-disease associations. TSMDA significantly outperforms similar methods, achieving an area under the receiver operating characteristic (ROC) curve (AUC) of 0.989 under 5-fold cross-validation and 0.982 on a blind test. We also demonstrate, as case studies, the capability of the method to uncover potential miRNA-disease associations in breast, prostate, and lung cancers. We believe TSMDA will be an invaluable tool for the community to explore and prioritize potentially new miRNA-disease associations for further experimental characterization. The method is freely available through a user-friendly web interface at http://biosig.unimelb.edu.au/tsmda/.