Chen Haiqing, Lai Haotian, Chi Hao, Fan Wei, Huang Jinbang, Zhang Shengke, Jiang Chenglu, Jiang Lai, Hu Qingwen, Yan Xiuben, Chen Yemeng, Zhang Jieying, Yang Guanhu, Liao Bin, Wan Juyi
School of Clinical Medicine, The Affiliated Hospital, Southwest Medical University, Luzhou, China.
Metabolic Vascular Diseases Key Laboratory of Sichuan Province, Key Laboratory of Cardiovascular Remodeling and Dysfunction, Department of Cardiovascular Surgery, The Affiliated Hospital, Southwest Medical University, Luzhou, China.
Front Cardiovasc Med. 2024 Nov 26;11:1397407. doi: 10.3389/fcvm.2024.1397407. eCollection 2024.
Atherosclerosis, a complex chronic vascular disorder with multifactorial etiology, stands as the primary culprit behind consequential cardiovascular events, imposing a substantial societal and economic burden. Nevertheless, our current understanding of its pathogenesis remains imprecise. In this investigation, our objective is to establish computational models elucidating molecular-level markers associated with atherosclerosis. This endeavor involves the integration of advanced machine learning techniques and comprehensive bioinformatics analyses.
Our analysis incorporated data from three publicly available Gene Expression Omnibus (GEO) datasets: GSE100927 (104 samples, 30,558 genes), which includes atherosclerotic lesions and control arteries from carotid, femoral, and infra-popliteal arteries of deceased organ donors; GSE43292 (64 samples, 23,307 genes), consisting of paired carotid endarterectomy samples from 32 hypertensive patients, comparing atheroma plaques and intact tissues; and GSE159677 (30,498 single cells, 33,538 genes), examining single-cell transcriptomes of calcified atherosclerotic core plaques and adjacent carotid artery tissues from patients undergoing carotid endarterectomy. Using single-cell sequencing, we systematically identified highly variable atherosclerotic monocyte subpopulations and analyzed cellular communication patterns with temporal dynamics. Weighted Gene Co-expression Network Analysis (WGCNA) identified key modules, and a Protein-Protein Interaction (PPI) network was constructed from module-associated genes. Marker genes were derived using three machine-learning models, formulated through logistic regression, and validated via convolutional neural network (CNN) modeling. Subtypes were clustered based on Gene Set Variation Analysis (GSVA) scores and validated through immunoassays.
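To make the logistic regression step concrete, the following is a minimal sketch, not the authors' pipeline. It fits a logistic model on three marker-gene expression values to separate plaque from control samples. The cohort is synthetic, and the upward shift in plaque samples, the sample sizes, the learning rate, and all function names are assumptions made purely for illustration; none of them come from the GEO datasets or the study's code.

```python
import math
import random

# Marker genes named in the abstract; everything else here is hypothetical.
GENES = ["CD36", "S100A10", "CSNK1A1"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_cohort(n_per_group=50, seed=0):
    """Draw fake standardized expression values (NOT real GEO data).

    Assumption for illustration only: plaque samples (label 1) are shifted
    upward by 1.5 standard deviations in every gene.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            samples.append([rng.gauss(1.5 * label, 1.0) for _ in GENES])
            labels.append(label)
    return samples, labels


def predict_proba(x, weights, bias):
    """Probability that a sample is a plaque sample under the fitted model."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(samples, labels, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    weights = [0.0] * len(GENES)
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(GENES)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict_proba(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


samples, labels = make_synthetic_cohort()
weights, bias = fit_logistic(samples, labels)
predictions = [int(predict_proba(x, weights, bias) >= 0.5) for x in samples]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

for gene, w in zip(GENES, weights):
    print(f"{gene}: coefficient {w:.2f}")
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

The fitted coefficients play the role of per-gene weights in a diagnostic score. In a real analysis the model would be trained on one dataset and evaluated on an independent one, rather than scored on its own training data as in this sketch.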
Three pivotal atherosclerosis-associated genes (CD36, S100A10, and CSNK1A1) were identified, offering valuable clinical insights. Profiling based on these genes delineated two distinct subtypes: C2 demonstrated potent microbicidal activity, while C1 was involved in inflammation regulation, tissue repair, and immune homeostasis. Molecular docking analyses explored the therapeutic potential of Estradiol, Zidovudine, Indinavir, and Dronabinol for clinical applications.
This study introduces three signature genes for atherosclerosis, shaping a novel paradigm for investigating clinical immunological medications. It distinguishes the highly biocidal C2 subtype from the inflammation-modulating C1 subtype, using the identified signature genes as crucial targets.