PONYTA: prioritization of phenotype-related genes from mouse KO events using PU learning on a biological network.

Affiliations

Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Republic of Korea.

Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Republic of Korea.

Publication information

Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae634.

Abstract

MOTIVATION

Transcriptome data from gene knock-out (KO) experiments in mice provide crucial insights into the intricate interactions between genotype and phenotype. Differentially expressed gene (DEG) analysis and network propagation (NP) are well-established methods for analysing transcriptome data. To identify genes related to phenotype changes in a KO experiment, each method requires a cutoff value on its selection criterion. A rigorous cutoff for DEG analysis and NP selects mostly genes that are truly related to the phenotype, but many phenotype-related genes are rejected as false negatives. A loose cutoff for either method, on the other hand, tends to include many genes that are not phenotype-related, which are false positives. The research problem is therefore how to handle the trade-off between false negatives and false positives.
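The cutoff trade-off described above can be made concrete with a toy calculation. The sketch below is illustrative only: the gene names, scores, ground-truth set, and cutoff values are invented for the example and do not come from the paper, and the helper `confusion_at_cutoff` is hypothetical.

```python
# Toy illustration only: gene names, scores, and labels are invented,
# not taken from the PONYTA paper or its data.

def confusion_at_cutoff(scores, truth, cutoff):
    """Return (false_positives, false_negatives) when every gene with
    score >= cutoff is called phenotype-related."""
    fp = sum(1 for g, s in scores.items() if s >= cutoff and g not in truth)
    fn = sum(1 for g, s in scores.items() if s < cutoff and g in truth)
    return fp, fn

# Hypothetical per-gene scores from some criterion (e.g. a DEG or NP score).
scores = {"g1": 0.95, "g2": 0.80, "g3": 0.60, "g4": 0.55, "g5": 0.30, "g6": 0.10}
# Hypothetical set of genes that are truly phenotype-related.
truth = {"g1", "g2", "g4"}

strict = confusion_at_cutoff(scores, truth, cutoff=0.9)  # few calls, misses g2 and g4
loose = confusion_at_cutoff(scores, truth, cutoff=0.5)   # recovers them, admits g3

print("strict cutoff (FP, FN):", strict)
print("loose cutoff  (FP, FN):", loose)
```

With these made-up numbers the strict cutoff yields no false positives but two false negatives, while the loose cutoff yields one false positive and no false negatives.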

RESULTS

We propose a novel framework called PONYTA for gene prioritization via positive-unlabeled (PU) learning on biological networks. We first select true phenotype-related genes using a rigorous cutoff value for DEG analysis and NP, and then rescue the resulting false negatives through PU learning. Evaluations on transcriptome data from multiple studies show that our approach has superior gene prioritization ability compared to benchmark models. Therefore, PONYTA effectively prioritizes genes related to phenotypes derived from gene KO events and can guide in vitro and in vivo gene KO experiments to make them more efficient.
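The abstract does not say which PU learning algorithm, classifier, or network features PONYTA uses, so the following is not the authors' method. It is a minimal sketch of one generic PU technique, bagging PU learning, swapped in for illustration: genes passing the strict cutoff are treated as positives, random subsets of the unlabeled genes act as pseudo-negatives in each round, and every unlabeled gene is scored only in rounds where it was left out of the bag. The nearest-centroid scorer, the two-dimensional features, the gene names, and the function `pu_bagging_scores` are all assumptions made for the example.

```python
import math
import random

def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    return tuple(sum(col) / len(col) for col in zip(*points))

def pu_bagging_scores(features, positives, n_rounds=50, seed=0):
    """Score unlabeled genes by bagging PU learning with a nearest-centroid scorer.

    features  -- dict: gene -> tuple of floats (hypothetical per-gene features)
    positives -- set of genes that passed the strict cutoff
    Returns a dict: unlabeled gene -> mean out-of-bag score in [0, 1];
    higher means closer to the positives than to the sampled pseudo-negatives.
    """
    rng = random.Random(seed)
    unlabeled = sorted(g for g in features if g not in positives)
    # Keep at least one unlabeled gene out of the bag in every round.
    k = min(len(positives), len(unlabeled) - 1)
    if k < 1:
        raise ValueError("need at least one positive and two unlabeled genes")
    pos_center = centroid([features[g] for g in sorted(positives)])
    totals = {g: 0.0 for g in unlabeled}
    counts = {g: 0 for g in unlabeled}
    for _ in range(n_rounds):
        bag = set(rng.sample(unlabeled, k))  # pseudo-negatives for this round
        neg_center = centroid([features[g] for g in sorted(bag)])
        for g in unlabeled:
            if g in bag:
                continue  # score only out-of-bag genes
            d_pos = math.dist(features[g], pos_center)
            d_neg = math.dist(features[g], neg_center)
            total = d_pos + d_neg
            totals[g] += 0.5 if total == 0 else d_neg / total
            counts[g] += 1
    return {g: totals[g] / counts[g] for g in unlabeled if counts[g] > 0}

# Invented toy data: (DEG-like score, NP-like score) per gene.
features = {
    "p1": (0.90, 0.90), "p2": (0.80, 0.95),   # passed the strict cutoff
    "u1": (0.85, 0.80),                        # missed by the cutoff, looks positive
    "u2": (0.10, 0.20), "u3": (0.20, 0.10),
    "u4": (0.15, 0.15), "u5": (0.05, 0.30),
}
positives = {"p1", "p2"}

pu_scores = pu_bagging_scores(features, positives)
ranking = sorted(pu_scores, key=pu_scores.get, reverse=True)
print(ranking)
```

In this toy setting the gene `u1`, which resembles the positives but was left unlabeled, ends up at the top of the ranking, which is the kind of false-negative rescue the paragraph above describes. A real pipeline would replace the centroid scorer with a trained classifier and derive features from the biological network.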

AVAILABILITY AND IMPLEMENTATION

The source code of PONYTA is available at https://github.com/Jun-Hyeong-Kim/PONYTA.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/586c/11561041/7aaf34a017b5/btae634f1.jpg