MODIG: An Attention Mechanism-Based Approach to Cancer Driver Gene Identification.

Authors

Zhao Wenyi, Zhou Zhan

Affiliations

State Key Laboratory of Advanced Drug Delivery and Release Systems & Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.

The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, China.

Publication information

Methods Mol Biol. 2025;2932:247-257. doi: 10.1007/978-1-0716-4566-6_13.

Abstract

Identifying genes that play a causal role in carcinogenesis remains one of the major challenges in cancer biology. High-throughput multi-omics data have accumulated over decades, yet integrating these data effectively into cancer driver gene identification is still difficult. Here, we propose MODIG, a graph attention network (GAT)-based framework that identifies cancer driver genes by combining multi-omics pan-cancer data (mutations, copy number variants, gene expression, and methylation levels) with a multidimensional gene network. The network uses genes as nodes and five types of gene associations (protein-protein interaction, gene sequence similarity, KEGG pathway co-occurrence, gene co-expression patterns, and gene ontology terms) as multiplex edges. On this graph, we apply a GAT encoder to model within-dimension interactions and generate a gene representation for each dimension, introduce a joint learning module that fuses the dimension-specific representations into a general gene representation, and use the resulting representation to perform a semi-supervised driver gene identification task. The MODIG program is available at https://github.com/zjupgx/modig . The code and data are also available on Zenodo at https://doi.org/10.5281/zenodo.7057241 .
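The pipeline described in the abstract (a per-dimension graph attention encoder, fusion of the dimension-specific representations, then semi-supervised scoring) can be illustrated with a minimal, untrained sketch in plain Python. This is not the MODIG implementation: the gene names, edges, feature values, and weights are all hypothetical toy values, one set of weights is shared across dimensions for brevity, and the fusion step is an assumed softmax-weighted average standing in for the joint learning module, whose exact form the abstract does not specify.

```python
import math

# Hypothetical toy data; names, edges, and values are illustrative only.
GENES = ["gene_a", "gene_b", "gene_c", "gene_d"]
MULTIPLEX_EDGES = {  # one edge list per network dimension
    "ppi": [("gene_a", "gene_b"), ("gene_b", "gene_c")],
    "sequence_similarity": [("gene_b", "gene_d")],
    "pathway_cooccurrence": [("gene_a", "gene_c")],
    "coexpression": [("gene_c", "gene_d")],
    "gene_ontology": [("gene_a", "gene_b")],
}
# Per-gene omics features: mutation, CNV, expression, methylation (made up).
FEATURES = {
    "gene_a": [1.0, 0.0, 0.5, 0.2],
    "gene_b": [0.2, 0.8, 0.1, 0.4],
    "gene_c": [0.0, 0.3, 0.9, 0.1],
    "gene_d": [0.6, 0.1, 0.2, 0.7],
}
W = [[0.5, -0.2, 0.1, 0.3], [0.1, 0.4, -0.3, 0.2]]  # 4-d input -> 2-d embedding
A = [0.3, -0.1, 0.2, 0.4]   # attention vector over [Wh_i || Wh_j]
Q = [0.6, -0.4]             # assumed query vector for dimension-level fusion
CLF_W, CLF_B = [0.8, -0.5], 0.1  # untrained logistic scoring head
LABELS = {"gene_a": 1, "gene_d": 0}  # only some genes are labeled

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def neighbors(edges, genes):
    """Undirected adjacency sets with self-loops."""
    adj = {g: {g} for g in genes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def gat_layer(features, adj, w, a, slope=0.2):
    """Single-head graph attention: softmax-weighted sum of projected neighbors."""
    proj = {g: [dot(row, x) for row in w] for g, x in features.items()}
    out = {}
    for g, nbrs in adj.items():
        nbrs = sorted(nbrs)
        raw = [dot(a, proj[g] + proj[n]) for n in nbrs]          # a^T [Wh_i || Wh_j]
        alphas = softmax([s if s > 0 else slope * s for s in raw])  # LeakyReLU, then softmax
        out[g] = [sum(al * proj[n][k] for al, n in zip(alphas, nbrs))
                  for k in range(len(w))]
    return out

def fuse(per_dim, genes, q):
    """Assumed fusion: softmax-weighted average of dimension-specific embeddings."""
    fused = {}
    for g in genes:
        embs = [emb[g] for emb in per_dim.values()]
        weights = softmax([dot(q, e) for e in embs])
        fused[g] = [sum(wt * e[k] for wt, e in zip(weights, embs))
                    for k in range(len(q))]
    return fused

def masked_loss(scores, labels):
    """Binary cross-entropy computed on labeled genes only (semi-supervised)."""
    terms = [y * math.log(scores[g]) + (1 - y) * math.log(1 - scores[g])
             for g, y in labels.items()]
    return -sum(terms) / len(terms)

per_dim = {name: gat_layer(FEATURES, neighbors(edges, GENES), W, A)
           for name, edges in MULTIPLEX_EDGES.items()}
fused = fuse(per_dim, GENES, Q)
scores = {g: 1.0 / (1.0 + math.exp(-(dot(CLF_W, z) + CLF_B))) for g, z in fused.items()}
loss = masked_loss(scores, LABELS)
```

In a trained model, the projection, attention, fusion, and scoring parameters would be learned by minimizing the masked loss, and the unlabeled genes would still shape the embeddings through message passing over every dimension of the network.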
