GediNET for discovering gene associations across diseases using knowledge based machine learning approach.

Affiliations

Information Technology Engineering, Al-Quds University, Abu Dis, Palestine.

The Wistar Institute, Philadelphia, PA, 19104, USA.

Publication information

Sci Rep. 2022 Nov 19;12(1):19955. doi: 10.1038/s41598-022-24421-0.

Abstract

The most common approaches to discovering genes associated with specific diseases are based on machine learning and use a variety of feature selection techniques to identify significant genes that can serve as biomarkers for a given disease. More recently, integrating prior knowledge-based approaches into this process has shown significant promise in the discovery of new biomarkers with potential translational applications. In this study, we developed a novel approach, GediNET, that integrates prior biological knowledge into gene Groups that are shown to be associated with a specific disease such as a cancer. The novelty of GediNET is that it then also allows the discovery of significant associations between that specific disease and other diseases. The initial step in this process involves the identification of gene Groups. The Groups are then subjected to a Scoring component to identify the top-performing classification Groups. The top-ranked gene Groups are then used to train a Machine Learning Model. GediNET uses this process of Grouping, Scoring and Modelling (G-S-M) to identify other diseases that are similarly associated with this signature. GediNET identifies these relationships through Disease-Disease Association (DDA)-based machine learning. DDA explores novel associations between diseases and identifies relationships that could be used to further improve approaches to diagnosis, prognosis, and treatment. The GediNET KNIME workflow can be downloaded from: https://github.com/malikyousef/GediNET.git or https://kni.me/w/3kH1SQV_mMUsMTS.
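The Grouping, Scoring and Modelling steps described in the abstract can be illustrated with a small, self-contained Python sketch. This is not the GediNET KNIME workflow: the function names, the toy expression data, the disease-keyed Groups, and the nearest-centroid classifier with leave-one-out scoring are all assumptions chosen for brevity, and the abstract does not state which classifier or scoring scheme GediNET actually uses.

```python
from statistics import mean

def train(samples, labels, genes):
    """Fit a nearest-centroid model (a stand-in classifier, assumed here)."""
    genes = sorted(genes)
    centroids = {}
    for lab in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == lab]
        centroids[lab] = [mean(r[g] for r in rows) for g in genes]
    return genes, centroids

def predict(model, sample):
    """Return the label whose centroid is closest to the sample."""
    genes, centroids = model
    def dist(center):
        return sum((sample[g] - v) ** 2 for g, v in zip(genes, center))
    return min(sorted(centroids), key=lambda lab: dist(centroids[lab]))

def score_group(samples, labels, genes):
    """Scoring: leave-one-out accuracy using only the genes of one Group."""
    hits = 0
    for i in range(len(samples)):
        model = train(samples[:i] + samples[i + 1:],
                      labels[:i] + labels[i + 1:], genes)
        hits += int(predict(model, samples[i]) == labels[i])
    return hits / len(samples)

def gsm(samples, labels, groups, top_k=1):
    """Score every Group, keep the top_k, and model on their pooled genes."""
    scores = {name: score_group(samples, labels, genes)
              for name, genes in groups.items()}
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    top = ranked[:top_k]
    selected = set().union(*(groups[n] for n in top))
    return top, scores, train(samples, labels, selected)

# Grouping: hypothetical prior knowledge mapping a disease to its genes.
groups = {"disease_A": {"g1", "g2"}, "disease_B": {"g3", "g4"}}

# Toy expression data (invented): g1 and g2 separate the two classes.
samples = [
    {"g1": 5.0, "g2": 4.8, "g3": 1.0, "g4": 2.0},
    {"g1": 5.2, "g2": 5.1, "g3": 2.0, "g4": 1.0},
    {"g1": 4.9, "g2": 5.0, "g3": 1.0, "g4": 1.0},
    {"g1": 1.0, "g2": 1.2, "g3": 1.0, "g4": 2.0},
    {"g1": 0.9, "g2": 1.1, "g3": 2.0, "g4": 1.0},
    {"g1": 1.1, "g2": 0.8, "g3": 1.0, "g4": 1.0},
]
labels = ["tumor"] * 3 + ["normal"] * 3

top, scores, model = gsm(samples, labels, groups, top_k=1)
print(top, scores)
```

Because each Group in this sketch is keyed by a disease name, the top-ranked Groups double as candidate disease-disease associations for the disease under study, which mirrors the DDA idea in the abstract only loosely.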

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7863/9675776/dab2d1d9f14c/41598_2022_24421_Fig1_HTML.jpg
