Machine learning-based diagnostic model of lymphatics-associated genes for new therapeutic target analysis in intervertebral disc degeneration.

Authors

Lin Maoqiang, Li Shaolong, Wang Yabin, Zheng Guan, Hu Fukang, Zhang Qiang, Song Pengjie, Zhou Haiyu

Affiliations

Department of Orthopedics, The Second Hospital & Clinical Medical School, Lanzhou University, Lanzhou, Gansu, China.

Key Laboratory of Bone and Joint Disease Research of Gansu Province, Lanzhou, Gansu, China.

Publication

Front Immunol. 2024 Dec 4;15:1441028. doi: 10.3389/fimmu.2024.1441028. eCollection 2024.

Abstract

BACKGROUND

Low back pain caused by intervertebral disc degeneration (IVDD) is a significant global social problem. The distribution of lymphatic vessels (LV) differs notably between normal and pathological intervertebral discs. Nevertheless, the molecular mechanisms of lymphatics-associated genes (LAGs) in the development of IVDD remain unclear. Exploring this area in depth should help reveal the biological and clinical significance of LAGs in IVDD and may point to new therapeutic targets for IVDD.

METHODS

Datasets were obtained from the Gene Expression Omnibus (GEO) database. Following quality control and normalization, three datasets (GSE153761, GSE147383, and GSE124272) were merged to form the training set, with GSE150408 serving as the validation set. LAGs were collected from the GeneCards, MSigDB, Gene Ontology, and KEGG databases. A Venn diagram was used to identify differentially expressed lymphatics-associated genes (DELAGs) between the normal and IVDD groups. Subsequently, four machine learning algorithms (SVM-RFE, Random Forest, XGB, and GLM) were compared to select the method for constructing the diagnostic model. The receiver operating characteristic (ROC) curve, a nomogram, and decision curve analysis (DCA) were used to evaluate model performance. In addition, we constructed a potential drug regulatory network and a competitive endogenous RNA (ceRNA) network for the key LAGs.
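The Venn-diagram step described above amounts to a set intersection between the differentially expressed genes and the lymphatics-associated gene list. The following is a minimal sketch, not the authors' code: the function name `find_delags`, the `GENE_*` placeholder symbols, and the toy membership of each database are hypothetical, and pooling the four sources by union is an assumption, since the abstract does not say how they were combined.

```python
def find_delags(deg_symbols, lag_sources):
    """Return genes that are both differentially expressed and lymphatics-associated.

    deg_symbols: iterable of gene symbols from the differential expression step.
    lag_sources: dict mapping a database name to a set of gene symbols.
    ASSUMPTION: the sources are pooled by union before intersecting.
    """
    lag_union = set().union(*lag_sources.values())
    return sorted(set(deg_symbols) & lag_union)


# Toy inputs only: GENE_A..GENE_D are placeholders, and the assignment of
# symbols to databases is invented for illustration.
degs = ["MET", "HHIP", "GENE_A", "CSF1", "GENE_B"]
lag_sources = {
    "GeneCards": {"MET", "CSF1", "GENE_C"},
    "MSigDB": {"HHIP", "GENE_D"},
    "GeneOntology": {"TOX", "MET"},
    "KEGG": {"SPRY1"},
}

delags = find_delags(degs, lag_sources)
print(delags)  # ['CSF1', 'HHIP', 'MET']
```

Sorting the result keeps the output deterministic, which makes it easier to compare gene lists between runs.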

RESULTS

A total of 15 differentially expressed LAGs were identified. After comparing the four machine learning methods, the five most important genes in the XGB model (MET, HHIP, SPRY1, CSF1, TOX) were selected as the lymphatics-associated gene diagnostic signature. This signature predicted IVDD with high accuracy, with an area under the curve (AUC) of 0.938. The diagnostic model was further validated in an external dataset (GSE150408), where the AUC was 0.772. The nomogram and DCA further supported the model's performance and predictive value. Additionally, drug regulatory and ceRNA networks were constructed, revealing potential therapeutic drugs and post-transcriptional regulatory mechanisms.
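For readers less familiar with the AUC values reported above, the sketch below shows one standard way to compute an AUC: as the fraction of (case, control) pairs in which the case receives the higher predicted score, with ties counted as one half. It is an illustration under stated assumptions, not the study's pipeline: the function name `roc_auc` is hypothetical, and the labels and scores are toy values that do not come from the paper.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    labels: 1 for IVDD, 0 for control.
    scores: predicted probability or risk score for each sample.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical), not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the 0.938 and 0.772 figures.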

CONCLUSION

Using machine learning algorithms, we developed and validated a lymphatics-associated gene diagnostic model that effectively identifies patients with IVDD. The five key LAGs may be potential therapeutic targets for IVDD.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54d1/11652530/8742fbf83c71/fimmu-15-1441028-g001.jpg
