GNNMutation: a heterogeneous graph-based framework for cancer detection.

Author information

Özcan Şimşek Nuriye Özlem, Özgür Arzucan, Gürgen Fikret

Affiliation

Department of Computer Engineering, Boğaziçi University, Bebek, İstanbul, 34342, Turkey.

Publication information

BMC Bioinformatics. 2025 Jun 4;26(1):153. doi: 10.1186/s12859-025-06133-0.

Abstract

BACKGROUND

When genes are translated into proteins, mutations in the gene sequence can change protein structure and function as well as the interactions between proteins. These changes can disrupt cell function and contribute to tumor development. In this study, we introduce a novel approach based on graph neural networks that jointly considers genetic mutations and protein interactions for cancer prediction. We use DNA mutations from whole exome sequencing data to construct a heterogeneous graph in which patients and proteins are represented as nodes and protein-protein interactions as edges. Patient nodes are additionally connected to protein nodes based on the mutations in each patient's DNA. Each patient node is represented by a feature vector derived from the mutations in specific genes. The feature values are calculated using a weighting scheme inspired by information retrieval, in which whole genomes are treated as documents and mutations as words within these documents. The weight of each gene, determined by its mutations, reflects its contribution to disease development. Within our novel heterogeneous graph structure, patient nodes are updated using both mutations and protein interactions. Because each mutation affects disease development differently, we process the input graph with attention-based graph neural networks.
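The weighting scheme and graph construction described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes a standard TF-IDF weighting in which each patient is a document and each mutated gene is a term. It also assumes one protein node per gene. The toy patients, mutations, interaction pairs, and all function and edge-type names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy data: patient ID -> mutated genes (one entry per mutation).
patient_mutations = {
    "P1": ["TP53", "TP53", "BRCA1"],
    "P2": ["TP53", "KRAS"],
    "P3": ["KRAS", "KRAS", "EGFR"],
}

# Hypothetical undirected protein-protein interaction pairs.
ppi_pairs = [("TP53", "BRCA1"), ("KRAS", "EGFR")]


def tfidf_gene_weights(patient_mutations):
    """Standard TF-IDF (an assumption): patients are documents, mutated genes are terms."""
    n_patients = len(patient_mutations)
    # Document frequency: number of patients with at least one mutation in the gene.
    doc_freq = Counter(
        gene for genes in patient_mutations.values() for gene in set(genes)
    )
    weights = {}
    for patient, genes in patient_mutations.items():
        counts = Counter(genes)
        total = sum(counts.values())
        weights[patient] = {
            gene: (count / total) * math.log(n_patients / doc_freq[gene])
            for gene, count in counts.items()
        }
    return weights


def build_hetero_graph(patient_mutations, ppi_pairs):
    """Typed edge lists: patient-protein edges from mutations, protein-protein edges from PPIs."""
    patient_protein = sorted(
        {(p, g) for p, genes in patient_mutations.items() for g in genes}
    )
    # Store each undirected interaction once, with endpoints in sorted order.
    protein_protein = sorted({tuple(sorted(pair)) for pair in ppi_pairs})
    return {
        ("patient", "has_mutation_in", "protein"): patient_protein,
        ("protein", "interacts_with", "protein"): protein_protein,
    }


weights = tfidf_gene_weights(patient_mutations)
graph = build_hetero_graph(patient_mutations, ppi_pairs)

# Fixed gene order so every patient feature vector has the same layout.
gene_order = sorted({g for genes in patient_mutations.values() for g in genes})
features = {
    patient: [weights[patient].get(gene, 0.0) for gene in gene_order]
    for patient in patient_mutations
}
```

Under this assumed weighting, a gene mutated in every patient receives a weight of zero, while a gene mutated repeatedly in only a few patients receives a high weight. In a full pipeline, the typed edge lists and feature vectors would be passed to an attention-based graph neural network, which this sketch does not implement.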

RESULTS

We compiled a dataset from the UK Biobank, with patients who have a cancer diagnosis as the case group and those without a cancer diagnosis as the control group. We evaluated our approach on the four most common cancer types (breast, prostate, lung, and colon cancer) and showed that the proposed framework effectively discriminates between case and control groups.

CONCLUSIONS

The results indicate that our proposed graph structure and node updating strategy improve cancer classification performance. Additionally, we extended our system with an explainer that identifies a list of causal genes that are influential in the model's cancer diagnosis predictions. Notably, some of these genes have already been studied in cancer research, demonstrating the system's ability to recognize causal genes for the selected cancer types and to make predictions based on them.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7245/12139269/abb840f26c4d/12859_2025_6133_Fig1_HTML.jpg
