Yousef Malik, Goy Gokhan, Mitra Ramkrishna, Eischen Christine M, Jabeer Amhar, Bakir-Gungor Burcu
Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel.
Department of Information Systems, Zefat Academic College, Zefat, Israel.
PeerJ. 2021 May 19;9:e11458. doi: 10.7717/peerj.11458. eCollection 2021.
A better understanding of disease development and progression mechanisms at the molecular level is critical both for the diagnosis of a disease and for the development of therapeutic approaches. Advances in high-throughput technologies have made it possible to generate mRNA and microRNA (miRNA) expression profiles, and the integrative analysis of these profiles has helped uncover the functional effects of RNA expression in complex diseases such as cancer. Several studies attempt to integrate miRNA and mRNA expression profiles using statistical methods such as Pearson correlation, and then combine the results with enrichment analysis. In this study, we developed a novel tool called miRcorrNet, which performs machine learning-based integration to analyze miRNA and mRNA gene expression profiles. miRcorrNet groups mRNAs based on their correlation to miRNA expression levels, and hence generates a group of target genes associated with each miRNA. These groups are then subjected to a rank function for classification. We evaluated our tool using miRNA and mRNA expression profiling data downloaded from The Cancer Genome Atlas (TCGA), and performed a comparative evaluation against existing tools. In our experiments, miRcorrNet performs as well as other tools in terms of accuracy (reaching an AUC value of more than 95%). Additionally, miRcorrNet includes ranking steps to separate two classes, namely case and control, which are not available in other tools. We also evaluated the performance of miRcorrNet using a completely independent dataset. Moreover, we conducted a comprehensive literature search to explore the biological functions of the identified miRNAs. We validated the significantly identified miRNA groups against known databases, which yielded about 90% accuracy. Our results suggest that miRcorrNet is able to accurately prioritize high-confidence miRNAs that regulate cancer across multiple cancer types.
The miRcorrNet tool and all other supplementary files are available at https://github.com/malikyousef/miRcorrNet.
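The group-then-rank idea described in the abstract can be illustrated with a minimal Python sketch. This is not the miRcorrNet implementation: the function names, the toy expression values, the absolute-correlation cutoff of 0.6, and the leave-one-out nearest-centroid score used as the rank function are all assumptions made for illustration. The abstract states only that mRNAs are grouped by their correlation to miRNA expression and that the groups are then ranked for classification.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def group_mrnas(mirna_expr, mrna_expr, threshold=0.6):
    """Map each miRNA to the mRNAs whose |correlation| with it meets the cutoff.

    The cutoff and the use of absolute correlation are illustrative assumptions.
    """
    return {
        mir: [g for g, vals in mrna_expr.items()
              if abs(pearson(mir_vals, vals)) >= threshold]
        for mir, mir_vals in mirna_expr.items()
    }


def loo_nearest_centroid_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    A simple stand-in for the rank function; assumes at least two samples per class.
    """
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        centroids = {}
        for cls in sorted(set(labels)):
            rows = [f for j, (f, l) in enumerate(zip(features, labels))
                    if j != i and l == cls]
            centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        pred = min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))
        correct += pred == y
    return correct / len(labels)


def rank_groups(groups, mrna_expr, labels):
    """Score each non-empty group by how well its mRNAs separate case from control."""
    scored = []
    for mir, genes in groups.items():
        if not genes:
            continue
        # One feature vector per sample, built from the group's mRNAs only.
        features = [list(sample) for sample in zip(*(mrna_expr[g] for g in genes))]
        scored.append((mir, loo_nearest_centroid_accuracy(features, labels)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): 8 samples, first 4 control (0), last 4 case (1).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
mirna_expr = {
    "miR-A": [1, 2, 3, 4, 5, 6, 7, 8],
    "miR-B": [3, 1, 4, 1, 5, 9, 2, 6],
}
mrna_expr = {
    "g1": [8, 7, 6, 5, 4, 3, 2, 1],
    "g2": [7.5, 7.2, 6.1, 5.3, 3.9, 3.2, 2.4, 0.8],
    "g3": [1, 5, 2, 6, 1, 5, 2, 6],
    "g4": [6, 2, 8, 2, 10, 18, 4, 12],
}

groups = group_mrnas(mirna_expr, mrna_expr)
ranked = rank_groups(groups, mrna_expr, labels)
print(groups)
print(ranked)
```

In this toy setting, the miRNA whose correlated mRNA group best separates the two classes is placed first. A real analysis would need a proper classifier with cross-validation and an AUC-based score, in line with the evaluation reported in the abstract.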