Pseudo2GO: A Graph-Based Deep Learning Method for Pseudogene Function Prediction by Borrowing Information From Coding Genes.

Author information

Fan Kunjie, Zhang Yan

Affiliations

Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States.

The Ohio State University Comprehensive Cancer Center, Columbus, OH, United States.

Publication information

Front Genet. 2020 Aug 18;11:807. doi: 10.3389/fgene.2020.00807. eCollection 2020.

DOI: 10.3389/fgene.2020.00807

PMID: 33014009

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7461887/

Abstract

Pseudogenes were historically regarded as relics of evolution, but they have recently shown increasing functional potential. Computational methods that predict pseudogene functions as Gene Ontology (GO) terms are important for directing experimental discovery. However, no pseudogene-specific computational method has been proposed to predict GO terms directly. The biggest challenge in pseudogene function prediction is the scarcity of features and functional annotations, which makes training a predictive model difficult. Pseudogenes are functionally similar to their parent coding genes, with which they share a large amount of DNA sequence, and coding genes are richly annotated; we therefore aim to predict pseudogene functions by borrowing information from coding genes in a graph-based way. Here we propose Pseudo2GO, a graph-based, semi-supervised deep learning model for pseudogene function prediction. A sequence similarity graph is first constructed to connect pseudogenes and coding genes. Multiple features are then incorporated into the model as node attributes, making the graph an attributed graph: expression profiles, interactions with microRNAs, protein-protein interactions (PPIs), and genetic interactions. Graph convolutional networks propagate the node attributes across the graph to classify pseudogenes. Comparing Pseudo2GO with other frameworks adapted from popular protein function prediction methods, we demonstrate that our method achieves state-of-the-art performance, significantly outperforming the other methods in terms of the M-AUPR metric.
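
The propagation step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the GCN variant, so the sketch assumes the commonly used symmetric normalization with self-loops and two layers. The toy graph, the feature dimensions, the random (untrained) weights, and the names `normalize_adjacency` and `gcn_forward` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}: add self-loops, then normalize symmetrically."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)              # degree of every node, self-loop included
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(adj, features, w1, w2):
    """Two-layer GCN forward pass with a sigmoid output for multi-label prediction."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w1, 0.0)   # layer 1 with ReLU
    logits = a_norm @ hidden @ w2                      # layer 2
    return 1.0 / (1.0 + np.exp(-logits))               # one score per GO term


# Hypothetical toy sequence-similarity graph (assumption, not from the paper):
# nodes 0-2 stand for coding genes, node 3 for a pseudogene similar to gene 0.
adj = np.array(
    [
        [0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [1, 0, 0, 0],
    ],
    dtype=float,
)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5))   # 5 made-up node attributes per gene
w1 = rng.normal(size=(5, 8))         # untrained weights, for shape illustration only
w2 = rng.normal(size=(8, 3))         # 3 hypothetical GO terms

scores = gcn_forward(adj, features, w1, w2)
pseudogene_scores = scores[3]        # scores for the pseudogene node
```

In a semi-supervised setting, all nodes take part in the propagation, but the training loss would be computed only on nodes that have annotations (here, the coding genes). The pseudogene node still receives scores because its representation is built from its neighbors' attributes.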

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a797/7461887/afea05d32d1f/fgene-11-00807-g0001.jpg

Similar articles

Pseudo2GO: A Graph-Based Deep Learning Method for Pseudogene Function Prediction by Borrowing Information From Coding Genes.

Front Genet. 2020 Aug 18;11:807. doi: 10.3389/fgene.2020.00807. eCollection 2020.

PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers.

Gigascience. 2019 May 1;8(5). doi: 10.1093/gigascience/giz046.

Predicting Pseudogene-miRNA Associations Based on Feature Fusion and Graph Auto-Encoder.

Front Genet. 2021 Dec 13;12:781277. doi: 10.3389/fgene.2021.781277. eCollection 2021.

A novel candidate disease gene prioritization method using deep graph convolutional networks and semi-supervised learning.

BMC Bioinformatics. 2022 Oct 14;23(1):422. doi: 10.1186/s12859-022-04954-x.

Network analysis of pseudogene-gene relationships: from pseudogene evolution to their functional potentials.

Pac Symp Biocomput. 2018;23:536-547.

Inferring pseudogene-MiRNA associations based on an ensemble learning framework with similarity kernel fusion.

Sci Rep. 2023 May 31;13(1):8833. doi: 10.1038/s41598-023-36054-y.

Predicting functions of maize proteins using graph convolutional network.

BMC Bioinformatics. 2020 Dec 16;21(Suppl 16):420. doi: 10.1186/s12859-020-03745-6.

Graph generative and adversarial strategy-enhanced node feature learning and self-calibrated pairwise attribute encoding for prediction of drug-related side effects.

Front Pharmacol. 2023 Sep 4;14:1257842. doi: 10.3389/fphar.2023.1257842. eCollection 2023.

Graph-based prediction of Protein-protein interactions with attributed signed graph embedding.

BMC Bioinformatics. 2020 Jul 21;21(1):323. doi: 10.1186/s12859-020-03646-8.

MAMF-GCN: Multi-scale adaptive multi-channel fusion deep graph convolutional network for predicting mental disorder.

Comput Biol Med. 2022 Sep;148:105823. doi: 10.1016/j.compbiomed.2022.105823. Epub 2022 Jul 6.

Cited by

Pseudogenes as Potential Diagnostic, Prognostic and Therapeutic Biomarkers in Colorectal Cancer: A Systematic Review.

Cancer Rep (Hoboken). 2025 Jun;8(6):e70263. doi: 10.1002/cnr2.70263.

DNA sequence analysis landscape: a comprehensive review of DNA sequence analysis task types, databases, datasets, word embedding methods, and language models.

Front Med (Lausanne). 2025 Apr 8;12:1503229. doi: 10.3389/fmed.2025.1503229. eCollection 2025.

Graph representation learning in biomedicine and healthcare.

Nat Biomed Eng. 2022 Dec;6(12):1353-1369. doi: 10.1038/s41551-022-00942-x. Epub 2022 Oct 31.

PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment.

Front Genet. 2022 Apr 4;13:839453. doi: 10.3389/fgene.2022.839453. eCollection 2022.

Topological network measures for drug repositioning.

Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa357.
