Department of Laboratory Medicine, West China Second University Hospital, Sichuan University, No. 20, Section 3, Renmin South Road, Chengdu 610041, Sichuan Province, PR China.
Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu, China.
BMC Pediatr. 2022 Aug 30;22(1):512. doi: 10.1186/s12887-022-03557-y.
Kawasaki disease (KD), characterized by systemic vasculitis, is the leading cause of acquired heart disease in children. Herein, we developed a diagnostic model, with some prognostic ability, to help identify children with KD.
Gene expression datasets were downloaded from the Gene Expression Omnibus (GEO), and gene sets with a potential pathogenic mechanism in KD were identified using differentially expressed gene (DEG) screening, pathway enrichment analysis, random forest (RF) screening, and artificial neural network (ANN) construction.
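As a rough illustration of what a DEG screening step involves, the sketch below flags genes whose mean log2 expression differs between case and control samples. It is not the authors' pipeline: the abstract does not state the statistical test or cutoffs used, so the Welch t-statistic, the thresholds `min_abs_lfc` and `min_abs_t`, the function names, and the toy genes `GENE_A` to `GENE_C` are all assumptions made for illustration only.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both groups are constant: no difference, or an infinitely sharp one.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def screen_degs(expr, case_ids, control_ids, min_abs_lfc=1.0, min_abs_t=3.0):
    """Return {gene: "up" | "down"} for genes passing both cutoffs.

    expr maps gene -> {sample_id: log2 expression}. Because values are
    assumed to be on a log2 scale, the log2 fold change is a difference
    of group means. Both cutoffs are illustrative assumptions.
    """
    flagged = {}
    for gene, values in expr.items():
        case = [values[s] for s in case_ids]
        ctrl = [values[s] for s in control_ids]
        lfc = mean(case) - mean(ctrl)
        t = welch_t(case, ctrl)
        if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t:
            flagged[gene] = "up" if lfc > 0 else "down"
    return flagged


# Hypothetical toy data: three made-up genes, three cases, three controls.
toy_expr = {
    "GENE_A": {"c1": 8.1, "c2": 8.4, "c3": 7.9, "h1": 5.0, "h2": 5.2, "h3": 4.9},
    "GENE_B": {"c1": 3.0, "c2": 3.2, "c3": 2.9, "h1": 6.1, "h2": 6.0, "h3": 6.3},
    "GENE_C": {"c1": 5.0, "c2": 5.1, "c3": 4.9, "h1": 5.0, "h2": 4.9, "h3": 5.1},
}
toy_degs = screen_degs(toy_expr, ["c1", "c2", "c3"], ["h1", "h2", "h3"])
```

In practice, a screen of this kind would normally also correct for multiple testing across thousands of genes, which this sketch omits for brevity.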
We extracted 2,017 DEGs (1,130 with upregulated and 887 with downregulated expression) from GEO. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses showed that the DEGs were significantly enriched in innate/adaptive immune response-related processes. Subsequently, the results of weighted gene co-expression network analysis and DEG screening were combined and, using RF and ANN, a model with eight genes (VPS9D1, CACNA1E, SH3GLB1, RAB32, ADM, GYG1, PGS1, and HIST2H2AC) was constructed. Classification results of the new model for KD diagnosis showed excellent performance for different datasets, including those of patients with KD, convalescents, and healthy individuals, with area under the curve values of 1, 0.945, and 0.95, respectively.
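For readers unfamiliar with the area under the curve (AUC) values reported above, the sketch below computes the AUC of a receiver operating characteristic curve directly from its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the example labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 1 (e.g. KD) or 0 (e.g. control).
    scores: model outputs, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up scores: one positive (0.4) is ranked below one negative (0.5).
example_auc = roc_auc([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.5, 0.2, 0.1])
```

An AUC of 1 therefore means every positive sample in that dataset scored above every negative sample, while values such as 0.945 or 0.95 indicate that a small fraction of pairs were ranked in the wrong order.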
We used machine learning methods to construct and validate a diagnostic model using multiple bioinformatic datasets, and identified molecules expected to serve as new biomarkers for or therapeutic targets in KD.