Identification of key candidate genes for IgA nephropathy using machine learning and statistics based bioinformatics models.

Affiliations

School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu, Fukushima, 965-8580, Japan.

Statistics Discipline, Khulna University, Khulna, 9208, Bangladesh.

Publication information

Sci Rep. 2022 Aug 17;12(1):13963. doi: 10.1038/s41598-022-18273-x.

Abstract

Immunoglobulin A nephropathy (IgAN) is a kidney disease caused by the accumulation of IgA deposits in the kidneys, which leads to inflammation and damage of kidney tissue. Bioinformatics-based approaches are widely used to predict novel candidate genes and pathways associated with IgAN. However, the molecular mechanisms and causes of IgAN development and progression remain incompletely understood. Therefore, the present study aimed to identify key candidate genes for IgAN using machine learning (ML) and statistics-based bioinformatics models. First, differentially expressed genes (DEGs) were identified using limma, and enrichment analysis was then performed on the DEGs using DAVID. A protein-protein interaction (PPI) network was constructed using STRING, and Cytoscape was used to determine hub genes based on connectivity and hub modules, with their associated genes from the DEGs, based on MCODE scores. Furthermore, three ML-based algorithms, namely support vector machine (SVM), least absolute shrinkage and selection operator (LASSO), and partial least squares discriminant analysis (PLS-DA), were applied to identify the discriminative genes of IgAN from the DEGs. Finally, the key candidate genes (FOS, JUN, EGR1, FOSB, and DUSP1) were identified as the genes shared among the selected hub genes, the hub module genes, and the discriminative genes from SVM, LASSO, and PLS-DA; these genes may be useful for the diagnosis and treatment of IgAN.
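The final selection step described in the abstract, ranking hub genes by connectivity and then keeping only the genes common to every list, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (the abstract states that Cytoscape, MCODE, and the ML models were used for those steps). The function names `hub_genes_by_degree` and `overlap`, the toy PPI edges, and the placeholder genes `GENE_A` to `GENE_D` are assumptions invented for illustration; only the five key gene names come from the abstract.

```python
from collections import Counter
from itertools import combinations


def hub_genes_by_degree(edges, top_n):
    """Return the top_n genes ranked by connectivity (number of PPI edges)."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    # Sort by degree (descending), then by name so ties break deterministically.
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return set(ranked[:top_n])


def overlap(*gene_sets):
    """Return the genes present in every one of the given gene sets."""
    return set.intersection(*(set(genes) for genes in gene_sets))


# Key candidate genes reported in the abstract.
reported = ["FOS", "JUN", "EGR1", "FOSB", "DUSP1"]

# Toy PPI edge list (hypothetical, NOT the STRING network from the study).
toy_edges = list(combinations(reported, 2)) + [
    ("FOS", "GENE_A"),
    ("JUN", "GENE_B"),
    ("GENE_A", "GENE_B"),
]

# Hypothetical gene lists standing in for the outputs of each method.
hub_genes = hub_genes_by_degree(toy_edges, top_n=5)
module_genes = set(reported) | {"GENE_A"}
svm_genes = set(reported) | {"GENE_B", "GENE_C"}
lasso_genes = set(reported) | {"GENE_C"}
plsda_genes = set(reported) | {"GENE_A", "GENE_D"}

key_candidates = overlap(hub_genes, module_genes, svm_genes, lasso_genes, plsda_genes)
print(sorted(key_candidates))
```

Requiring a gene to appear in every list is a conservative choice: a gene supported by only some of the methods is dropped, which is why the placeholder genes do not survive the intersection in this toy example.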

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/40d0/9385868/e2a197168d73/41598_2022_18273_Fig1_HTML.jpg
