A heterogeneous graph transformer framework for accurate cancer driver gene prediction and downstream analysis.

Affiliations

School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China.

Sichuan Institute of Computer Sciences, Chengdu, 610041, China.

Publication information

Methods. 2024 Dec;232:9-17. doi: 10.1016/j.ymeth.2024.09.018. Epub 2024 Oct 18.

Abstract

Accurately predicting cancer driver genes remains a formidable challenge given the growing volume and complexity of cancer genomic data. In this study, we propose HGTDG, a heterogeneous graph transformer framework for predicting cancer driver genes and exploring downstream tasks. Central to the framework is a heterogeneous graph construction module, which assembles a gene-protein heterogeneous network from Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and protein-protein interactions sourced from the STRING (search tool for recurring instances of neighboring genes) database. The framework also introduces a heterogeneous graph transformer module that uses multi-head attention to learn node embeddings. This module captures distinct representations for both nodes and edges, enriching the model's predictive capacity. The resulting node embeddings are then passed to a classification module that discriminates between driver and non-driver genes. Our experiments show that HGTDG outperforms existing methods on performance metrics including the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Furthermore, downstream analysis using the newly identified cancer driver genes underscores the efficacy and versatility of the proposed framework.
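The abstract names AUROC and AUPRC as evaluation metrics but does not say how they were computed. As a minimal sketch of what the two metrics measure, the pure-Python example below scores a toy driver-gene classifier. The labels and scores are made-up illustrative values, not results from the paper, and the function names `auroc` and `average_precision` are hypothetical. Average precision is used here as one common step-wise estimate of AUPRC; the authors may have used a different estimator.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive (driver) gene
    outscores a random negative gene; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: the mean of the precision values observed at the
    rank of each positive, with genes ranked by descending score.
    Assumes no tied scores (ties would make the ranking order-dependent)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Toy data (illustrative only): 1 = driver gene, 0 = non-driver gene.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

roc = auroc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUROC = {roc:.4f}, AUPRC (average precision) = {ap:.4f}")
```

Driver genes are typically far rarer than non-driver genes, which is one reason to report AUPRC alongside AUROC: precision is sensitive to false positives among the top-ranked genes, whereas AUROC can stay high on imbalanced data.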

