
GediNET for discovering gene associations across diseases using knowledge based machine learning approach.

Affiliations

Information Technology Engineering, Al-Quds University, Abu Dis, Palestine.

The Wistar Institute, Philadelphia, PA, 19104, USA.

Publication information

Sci Rep. 2022 Nov 19;12(1):19955. doi: 10.1038/s41598-022-24421-0.

DOI: 10.1038/s41598-022-24421-0
PMID: 36402891
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9675776/
Abstract

The most common approaches to discovering genes associated with specific diseases are based on machine learning and use a variety of feature selection techniques to identify significant genes that can serve as biomarkers for a given disease. More recently, the integration in this process of prior knowledge-based approaches has shown significant promise in the discovery of new biomarkers with potential translational applications. In this study, we developed a novel approach, GediNET, that integrates prior biological knowledge to gene Groups that are shown to be associated with a specific disease such as a cancer. The novelty of GediNET is that it then also allows the discovery of significant associations between that specific disease and other diseases. The initial step in this process involves the identification of gene Groups. The Groups are then subjected to a Scoring component to identify the top performing classification Groups. The top-ranked gene Groups are then used to train a Machine Learning Model. The process of Grouping, Scoring and Modelling (G-S-M) is used by GediNET to identify other diseases that are similarly associated with this signature. GediNET identifies these relationships through Disease-Disease Association (DDA) based machine learning. DDA explores novel associations between diseases and identifies relationships which could be used to further improve approaches to diagnosis, prognosis, and treatment. The GediNET KNIME workflow can be downloaded from: https://github.com/malikyousef/GediNET.git or https://kni.me/w/3kH1SQV_mMUsMTS .
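The Grouping, Scoring and Modelling (G-S-M) steps described in the abstract can be illustrated with a minimal Python sketch. It is not the GediNET KNIME workflow: the toy expression matrix, the gene and disease names, the nearest-centroid classifier, and the leave-one-out scoring are all assumptions chosen to keep the example small and dependency-free. Genes are grouped by the disease they are annotated to, each group is scored by how well its genes alone separate the two classes, and the top-ranked groups supply the genes for the final model.

```python
from statistics import mean

def centroids(X, y, cols):
    """Per-class mean vector over the chosen gene columns."""
    out = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        out[label] = [mean(r[c] for r in rows) for c in cols]
    return out

def predict(model, cols, x):
    """Assign x to the class with the nearest centroid (squared Euclidean)."""
    def dist(label):
        return sum((x[c] - m) ** 2 for c, m in zip(cols, model[label]))
    return min(model, key=dist)

def loo_accuracy(X, y, cols):
    """Scoring: leave-one-out accuracy using only the genes of one group."""
    hits = 0
    for i in range(len(X)):
        model = centroids(X[:i] + X[i + 1:], y[:i] + y[i + 1:], cols)
        hits += predict(model, cols, X[i]) == y[i]
    return hits / len(X)

def gsm(X, y, gene_names, groups, top_k=1):
    col = {g: i for i, g in enumerate(gene_names)}
    # Grouping: map each disease to the columns of its annotated genes.
    group_cols = {d: [col[g] for g in genes if g in col]
                  for d, genes in groups.items()}
    # Scoring: rank groups by how well their genes classify the samples.
    scores = {d: loo_accuracy(X, y, cols)
              for d, cols in group_cols.items() if cols}
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Modelling: train on the union of genes from the top-ranked groups.
    selected = sorted({c for d in ranked[:top_k] for c in group_cols[d]})
    return ranked, scores, selected, centroids(X, y, selected)

# Hypothetical toy data: rows are samples, columns are genes.
GENES = ["g1", "g2", "g3", "g4"]
X = [[1.0, 1.2, 5.0, 3.0], [1.1, 0.9, 3.0, 5.0], [0.9, 1.0, 4.0, 4.0],
     [3.0, 3.1, 5.0, 3.0], [2.9, 3.2, 3.0, 5.0], [3.1, 2.8, 4.0, 4.0]]
y = [0, 0, 0, 1, 1, 1]  # 0 = control, 1 = target disease
# Hypothetical prior knowledge: disease -> genes annotated to it.
GROUPS = {"disease_A": ["g1", "g2"], "disease_B": ["g3", "g4"]}

ranked, scores, selected, model = gsm(X, y, GENES, GROUPS, top_k=1)
print("ranked groups:", ranked)
print("selected gene columns:", selected)
```

In this sketch, a disease group that ranks highly is one whose genes also discriminate the target disease, which is the sense in which the ranking hints at a disease-disease association. A real analysis would use a stronger classifier, repeated cross-validation, and curated gene-disease annotations.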


