Coupling of Co-expression Network Analysis and Machine Learning Validation Unearthed Potential Key Genes Involved in Rheumatoid Arthritis.

Authors

Xiao Jianwei, Wang Rongsheng, Cai Xu, Ye Zhizhong

Affiliations

Department of Rheumatology and Immunology, Shenzhen Futian Hospital for Rheumatic Diseases, Shenzhen, China.

Department of Rheumatology, Shanghai Guanghua Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai, China.

Publication information

Front Genet. 2021 Feb 11;12:604714. doi: 10.3389/fgene.2021.604714. eCollection 2021.

Abstract

Rheumatoid arthritis (RA) is an incurable disease that afflicts 0.5-1.0% of the global population, though it is less threatening at its early stage. Improved diagnostic efficiency and prognostic outcomes are therefore critical for confronting RA. Although machine learning is considered a promising technique in clinical research, its potential for verifying the biological significance of genes has not been fully exploited. The performance of a machine learning model depends greatly on the features used for training; therefore, the effectiveness of prediction may reflect the quality of the input features. In the present study, we used weighted gene co-expression network analysis (WGCNA) in conjunction with differentially expressed gene (DEG) analysis to select key genes that were highly associated with RA phenotypes, based on multiple microarray datasets of RA blood samples. These key genes were then used as features in machine learning model validation. A total of six machine learning models were used to validate the biological significance of the key genes based on gene expression, of which five achieved good performance [area under the curve (AUC) >0.85], suggesting that the identified key genes are biologically significant and highly representative of genes involved in RA. Combined with other biological interpretations, including Gene Ontology (GO) analysis, protein-protein interaction (PPI) network analysis, and inference of immune cell composition, the current study may shed light on the in-depth study of RA diagnosis and prognosis.
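To make the AUC > 0.85 criterion concrete, the following is a minimal sketch of how an area under the ROC curve can be computed from class labels and model scores. It is not the authors' code: the function name `roc_auc`, the labels, and the scores are all hypothetical and chosen only for illustration. The function uses the rank interpretation of AUC, that is, the probability that a randomly chosen RA sample receives a higher score than a randomly chosen control.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = RA, 0 = control.
# The scores stand in for the output of a classifier trained on key-gene expression.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}, above 0.85: {auc > 0.85}")
```

In this toy case, 15 of the 16 RA-control pairs are ranked correctly, so the AUC is 0.9375. In practice the scores would come from held-out predictions, since an AUC measured on the training samples would overstate how informative the selected genes are.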

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd23/7905311/4c97fea6682f/fgene-12-604714-g0001.jpg
