Decoding Diabetes Biomarkers and Related Molecular Mechanisms by Using Machine Learning, Text Mining, and Gene Expression Analysis.

Affiliations

Department of Oral Biology, Faculty of Dentistry, Mansoura University, Mansoura 35116, Egypt.

Agricultural Genetic Engineering Research Institute, Agricultural Research Center, Giza 12619, Egypt.

Publication information

Int J Environ Res Public Health. 2022 Oct 26;19(21):13890. doi: 10.3390/ijerph192113890.

DOI:10.3390/ijerph192113890
PMID:36360783
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9656783/
Abstract

The molecular basis of diabetes mellitus is yet to be fully elucidated. We aimed to identify the most frequently reported and differentially expressed genes (DEGs) in diabetes by using bioinformatics approaches. Text mining was used to screen 40,225 article abstracts from the diabetes literature. These studies highlighted 5939 diabetes-related genes spread across 22 human chromosomes, with 112 genes mentioned in more than 50 studies. Among these genes, , , , , , , , , , and were mentioned in more than 200 articles. These genes are correlated with the regulation of glycogen and polysaccharide, adipogenesis, AGE/RAGE, and macrophage differentiation. Three datasets (44 patients and 57 controls) were subjected to gene expression analysis. The analysis revealed 135 significant DEGs, of which , , , , , , , , and were the top 10 DEGs. These genes were enriched in aerobic respiration, the T-cell antigen receptor pathway, the tricarboxylic acid metabolic process, the vitamin D receptor pathway, toll-like receptor signaling, and the endoplasmic reticulum (ER) unfolded protein response. The results of the text mining and gene expression analyses were used as attribute values for machine learning (ML) analysis. The decision tree, extra-tree regressor, and random forest algorithms were used in the ML analysis to identify unique markers that could be used as diabetes diagnosis tools. These algorithms produced prediction models with accuracies ranging from 0.6364 to 0.88 and an overall confidence interval (CI) of 95%. There were 39 biomarkers that could distinguish between diabetic and non-diabetic patients, 12 of which were repeated multiple times. The majority of these genes are associated with stress response, signalling regulation, locomotion, cell motility, growth, and muscle adaptation. Machine learning algorithms highlighted the use of the gene as a biomarker for diabetes early detection. Our data mining and gene expression analysis have provided useful information about potential biomarkers in diabetes.
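The text-mining step described in the abstract (counting how many abstracts mention each gene, then keeping genes above a threshold) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `count_gene_mentions` and `frequently_reported`, the toy abstracts, and the four gene symbols are all hypothetical and chosen only for illustration, and the whole-word matching rule is an assumption.

```python
import re
from collections import Counter


def count_gene_mentions(abstracts, gene_symbols):
    """Count, for each gene symbol, how many abstracts mention it at least once."""
    # Assumption: case-sensitive whole-symbol matching, so "INS" does not
    # match inside "INSR" or inside the lowercase word "insulin".
    patterns = {
        gene: re.compile(
            r"(?<![A-Za-z0-9-])" + re.escape(gene) + r"(?![A-Za-z0-9-])"
        )
        for gene in gene_symbols
    }
    counts = Counter()
    for text in abstracts:
        for gene, pattern in patterns.items():
            if pattern.search(text):
                counts[gene] += 1  # one count per abstract, not per occurrence
    return counts


def frequently_reported(counts, min_abstracts):
    """Return the genes mentioned in more than `min_abstracts` abstracts."""
    return sorted(g for g, n in counts.items() if n > min_abstracts)


# Hypothetical toy corpus; not data from the study.
abstracts = [
    "Variants in TCF7L2 and PPARG are associated with type 2 diabetes.",
    "PPARG agonists improve insulin sensitivity; PPARG is also adipogenic.",
    "INSR signalling is impaired, whereas INS expression is unchanged.",
]
genes = ["TCF7L2", "PPARG", "INS", "INSR"]

counts = count_gene_mentions(abstracts, genes)
print(dict(counts))
print(frequently_reported(counts, min_abstracts=1))
```

In the study the analogous thresholds were 50 and 200 articles over 40,225 abstracts; here the threshold is 1 only because the toy corpus has three entries.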

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/a76e85fd5dd9/ijerph-19-13890-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/d86e1a3438b0/ijerph-19-13890-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/234724d769cf/ijerph-19-13890-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/74700b1013bb/ijerph-19-13890-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/3465be3dcffa/ijerph-19-13890-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/dc2a7a163ef4/ijerph-19-13890-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/337429b6af25/ijerph-19-13890-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/479e28b14db2/ijerph-19-13890-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/706cfc69e2d4/ijerph-19-13890-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b3/9656783/e01fb19ad1ab/ijerph-19-13890-g010.jpg
