School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, EH8 9AB, United Kingdom.
Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, United Kingdom.
Bioinformatics. 2024 Sep 2;40(9). doi: 10.1093/bioinformatics/btae523.
Heterogeneity in human diseases complicates diagnosis and treatment because of the broad range of manifestations and symptoms. With the growing availability of labelled multi-omic data, integrative machine learning methods have achieved breakthroughs in treatment by redefining these diseases at a more granular level. However, these approaches are often limited in scalability, prone to oversimplification, and poor at handling missing data.
In this study, we introduce Multi-Omic Graph Diagnosis (MOGDx), a flexible command line tool that integrates multi-omic data to perform classification tasks for heterogeneous diseases. MOGDx follows a network taxonomy: it fuses patient similarity networks, augments the integrated network with a reduced vector representation of genomic data, and performs classification using a graph convolutional network. MOGDx was evaluated on three datasets from The Cancer Genome Atlas (TCGA): breast invasive carcinoma, kidney cancer, and low-grade glioma. MOGDx demonstrated state-of-the-art performance and an ability to identify relevant multi-omic markers in each task. It integrated more genomic measures with greater patient coverage than other network integrative methods. Overall, MOGDx is a promising tool for integrating multi-omic data, classifying heterogeneous diseases, and aiding interpretation of genomic marker data.
MOGDx source code is available from https://github.com/biomedicalinformaticsgroup/MOGDx.
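The three stages named in the abstract (patient similarity networks, a fused network augmented with reduced feature vectors, and graph convolutional classification) can be illustrated with a minimal NumPy sketch. This is not the MOGDx implementation. All function names are hypothetical, the data are random, network fusion is replaced by a simple element-wise average, the reduced representation is replaced by a PCA projection, and the graph convolutional network is an untrained two-layer forward pass with random weights.

```python
import numpy as np

def knn_similarity_network(features, k=3):
    """Build a symmetric k-nearest-neighbour patient graph from Pearson correlation."""
    sim = np.corrcoef(features)          # rows are patients, so this is patient x patient
    np.fill_diagonal(sim, -np.inf)       # exclude each patient from its own neighbours
    n = sim.shape[0]
    nearest = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)        # symmetrise: keep an edge if either side chose it

def fuse_networks(networks):
    """Assumption: element-wise averaging as a stand-in for a real fusion method."""
    return np.mean(networks, axis=0)

def pca_embedding(features, dim):
    """Assumption: PCA as a stand-in for a learned reduced vector representation."""
    centred = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :dim] * s[:dim]

def normalise_adjacency(adj):
    """Symmetric normalisation with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_norm, x, w1, w2):
    """Two graph convolution layers followed by a row-wise softmax."""
    hidden = np.maximum(a_norm @ x @ w1, 0.0)            # ReLU
    logits = a_norm @ hidden @ w2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

# Synthetic example: 12 patients, two omic modalities, 3 disease subtypes.
rng = np.random.default_rng(0)
n_patients, n_classes, embed_dim, hidden_dim = 12, 3, 8, 16
mrna = rng.normal(size=(n_patients, 20))
mirna = rng.normal(size=(n_patients, 15))

fused = fuse_networks([knn_similarity_network(m) for m in (mrna, mirna)])
node_features = pca_embedding(np.hstack([mrna, mirna]), embed_dim)

w1 = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))   # untrained weights
w2 = rng.normal(scale=0.1, size=(hidden_dim, n_classes))
probs = gcn_forward(normalise_adjacency(fused), node_features, w1, w2)
```

In this sketch each patient is a node, the fused adjacency matrix determines which patients exchange information, and `probs` holds one row of class probabilities per patient. A real pipeline would train the weights against known labels and would need a fusion step that tolerates patients missing from some modalities, which plain averaging does not.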