Department of Geriatrics, Renmin Hospital of Wuhan University, Wuhan, 430060, Hubei, China.
Department of Cardiology, Renmin Hospital of Wuhan University, Wuhan, 430060, Hubei, China.
BMC Bioinformatics. 2023 May 12;24(1):196. doi: 10.1186/s12859-023-05244-w.
Atherosclerosis is the common pathological basis for many cardiovascular and cerebrovascular diseases. The purpose of this study was to identify diagnostic biomarkers related to atherosclerosis using machine learning algorithms.
Clinicopathological parameters and transcriptomic data were obtained from four datasets (GSE21545, GSE20129, GSE43292, GSE100927). A nonnegative matrix factorization algorithm was used to classify atherosclerosis patients in the GSE21545 dataset into subtypes. We then identified prognosis-related differentially expressed genes (DEGs) between the subtypes. Multiple machine learning methods were used to detect pivotal markers. The discrimination, calibration, and clinical usefulness of the prediction model were assessed using the area under the curve, calibration plots, and decision curve analysis, respectively. The expression levels of the feature genes were validated in GSE20129, GSE43292, and GSE100927.
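The two numeric evaluation measures named above can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `net_benefit` are hypothetical, the area under the curve is assumed to be the area under the ROC curve, and net benefit is computed with the standard decision curve formula (true positives minus threshold-weighted false positives, per sample).

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A positive scored above a negative counts 1; a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of the model at one threshold probability.

    A decision curve plots this value over a range of thresholds.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Samples with predicted probability >= threshold are treated as positive.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Illustrative toy labels and probabilities, not data from the study.
    labels = [0, 0, 1, 1]
    probs = [0.1, 0.4, 0.35, 0.8]
    print(auc(labels, probs))               # 0.75
    print(net_benefit(labels, probs, 0.5))  # 0.25
```

In practice a model is judged clinically useful over the range of thresholds where its net benefit exceeds both the "treat all" and "treat none" strategies; the sketch computes only the model's own curve values.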
Two molecular subtypes of atherosclerosis were identified, along with 223 prognosis-related DEGs between them. These genes were related not only to epithelial cell proliferation and mitochondrial dysfunction but also to immune-related pathways. Least absolute shrinkage and selection operator (LASSO), random forest, and support vector machine-recursive feature elimination (SVM-RFE) identified IL17C and ACOXL as diagnostic markers of atherosclerosis. The prediction model displayed good discrimination and calibration, and decision curve analysis showed that it was clinically useful. Moreover, IL17C and ACOXL were verified in the three other GEO datasets, where they also showed good predictive performance.
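One common way to combine several feature-selection methods is to keep only the genes that every method selects. The abstract does not state how the three results were combined, so the sketch below is an assumption rather than the authors' procedure; the function name `consensus_features` and the placeholder gene names are hypothetical.

```python
from typing import Iterable, List, Mapping


def consensus_features(selections: Mapping[str, Iterable[str]]) -> List[str]:
    """Return the features chosen by every method, sorted by name."""
    if not selections:
        raise ValueError("at least one selection method is required")
    gene_sets = [set(genes) for genes in selections.values()]
    # Intersect the per-method sets to keep only the shared features.
    return sorted(set.intersection(*gene_sets))


if __name__ == "__main__":
    # Placeholder gene names for illustration only, not results from the study.
    selected = {
        "lasso": ["GENE_A", "GENE_B", "GENE_C"],
        "random_forest": ["GENE_B", "GENE_C", "GENE_D"],
        "svm_rfe": ["GENE_C", "GENE_B", "GENE_E"],
    }
    print(consensus_features(selected))  # ['GENE_B', 'GENE_C']
```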
IL17C and ACOXL were identified as diagnostic genes for atherosclerosis and were associated with a higher incidence of ischemic events.