Kesimoglu Ziynet Nesibe, Bozdag Serdar
Department of Computer Science and Engineering, University of North Texas, Denton, TX, USA.
Department of Mathematics, University of North Texas, Denton, TX, USA.
NAR Genom Bioinform. 2023 Jun 28;5(2):lqad063. doi: 10.1093/nargab/lqad063. eCollection 2023 Jun.
To pave the way towards precision medicine in cancer, patients with similar biology should be grouped into the same cancer subtypes. Integrative approaches have been developed to uncover cancer subtypes from high-dimensional multiomics datasets. Recently, Graph Neural Networks have been shown to learn node embeddings from node features and associations on graph-structured data. Some integrative prediction tools have leveraged these advances on multiple networks, but with some limitations. To address these limitations, we developed SUPREME, a node classification framework that integrates multiple data modalities on graph-structured data. For breast cancer subtyping, unlike existing tools, SUPREME generates patient embeddings from multiple similarity networks utilizing multiomics features and integrates them with raw features to capture complementary signals. On breast cancer subtype prediction tasks from three datasets, SUPREME outperformed other tools. SUPREME-inferred subtypes had significant survival differences, mostly more significant than those of the ground truth labels, and outperformed nine other approaches. These results suggest that, with proper multiomics data utilization, SUPREME could reveal undiscovered characteristics of cancer subtypes that cause significant survival differences and could improve the ground truth labels, which depend mainly on one datatype. In addition, to show the model-agnostic property of SUPREME, we applied it to two additional datasets, where it showed clearly better performance.
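The integration idea in the abstract (one patient similarity network per data modality, an embedding per network, then concatenation with raw features) can be illustrated with a minimal, untrained sketch. This is not the SUPREME implementation. The function names (`knn_graph`, `propagate`, `integrate`), the cosine k-nearest-neighbour networks, the single normalized-adjacency propagation step standing in for a trained graph neural network, the use of all concatenated features as node features, and the toy data are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_graph(features, k):
    """Assumed network construction: link each patient to its k most similar peers."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sims = sorted(
            ((cosine(features[i], features[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for _, j in sims[:k]:
            adj[i][j] = adj[j][i] = 1.0  # keep the network symmetric
    return adj


def propagate(adj, features):
    """One untrained graph-convolution-style step: D^-1/2 (A + I) D^-1/2 X."""
    n, d = len(adj), len(features[0])
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    out = []
    for i in range(n):
        row = [0.0] * d
        for j in range(n):
            w = a_hat[i][j] / math.sqrt(deg[i] * deg[j])
            for c in range(d):
                row[c] += w * features[j][c]
        out.append(row)
    return out


def integrate(omics, k=2):
    """Concatenate one embedding per similarity network with the raw features."""
    names = sorted(omics)
    n = len(omics[names[0]])
    # Node features: all modalities side by side (an assumption of this sketch).
    raw = [[x for name in names for x in omics[name][i]] for i in range(n)]
    combined = [[] for _ in range(n)]
    for name in names:
        emb = propagate(knn_graph(omics[name], k), raw)
        for i in range(n):
            combined[i].extend(emb[i])
    for i in range(n):
        combined[i].extend(raw[i])  # raw features appended last
    return combined


# Hypothetical toy data: 4 patients, 2 modalities.
omics = {
    "expression": [[1.0, 0.2, 0.1], [0.9, 0.3, 0.0], [0.1, 1.0, 0.8], [0.0, 0.9, 1.0]],
    "methylation": [[0.5, 0.1], [0.4, 0.2], [0.1, 0.9], [0.2, 0.8]],
}
combined = integrate(omics)
```

In this sketch each patient ends up with one vector holding two network embeddings followed by the raw features, which a downstream classifier could then use for subtype prediction. In a trained setting, the propagation step would be replaced by a learned graph neural network.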