Machine Learning in Identifying Marker Genes for Congenital Heart Diseases of Different Cardiac Cell Types.

Authors

Ma Qinglan, Zhang Yu-Hang, Guo Wei, Feng Kaiyan, Huang Tao, Cai Yu-Dong

Affiliations

School of Life Sciences, Shanghai University, Shanghai 200444, China.

Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.

Publication information

Life (Basel). 2024 Aug 19;14(8):1032. doi: 10.3390/life14081032.

Abstract

Congenital heart disease (CHD) represents a spectrum of inborn heart defects influenced by genetic and environmental factors. This study analyzed gene expression profiles of 21,034 cardiac fibroblasts, 73,296 cardiomyocytes, and 35,673 endothelial cells using single-cell analysis and machine learning techniques. Six conditions were investigated for each cardiac cell type: dilated cardiomyopathy (DCM), donor hearts (used as healthy controls), hypertrophic cardiomyopathy (HCM), heart failure with hypoplastic left heart syndrome (HF_HLHS), neonatal hypoplastic left heart syndrome (Neo_HLHS), and tetralogy of Fallot (TOF). Each cell sample was represented by 29,266 gene features. These features were first analyzed by six feature-ranking algorithms, yielding several feature lists. These lists were then fed into incremental feature selection, which incorporated two classification algorithms, to extract essential gene features and classification rules and to build efficient classifiers. The identified essential genes are potential CHD markers in different cardiac cell types. For instance, LASSO identified key genes specific to various cardiac cell types in CHD subtypes. was found to be up-regulated in cardiac fibroblasts in both DCM and HCM. In cardiomyocytes, distinct genes such as , , , , and were linked to DCM, Neo_HLHS, HCM, HF_HLHS, and TOF, respectively. Endothelial cell analysis further revealed , , and as significant genes for DCM, HCM, and TOF. LightGBM, CatBoost, MCFS, RF, and XGBoost further delineated key genes for specific CHD subtypes, demonstrating the efficacy of machine learning in identifying CHD-specific genes. Additionally, this study developed quantitative rules representing the gene expression patterns related to CHDs. This research underscores the potential of machine learning in unraveling the molecular complexities of CHD and establishes a foundation for future mechanism-based studies.
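The rank-then-select workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ranker here is a simple between-class/within-class variance ratio standing in for LASSO, LightGBM, and the other ranking algorithms, the classifier is a nearest-centroid rule standing in for the study's two classification algorithms, and the data are synthetic. All function names (make_toy_data, rank_features, incremental_feature_selection, and so on) are hypothetical.

```python
import random

def make_toy_data(n_per_class=40, n_features=10, n_informative=3, n_classes=3, seed=0):
    """Synthetic stand-in for an expression matrix (assumption, not study data).
    Only the first n_informative features differ between classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += 3.0 * label  # class-dependent shift
            X.append(row)
            y.append(label)
    return X, y

def rank_features(X, y):
    """Stand-in ranker: sort features by between-class / within-class variation."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        grand = sum(col) / len(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, lab in zip(col, y) if lab == c]
            mu = sum(vals) / len(vals)
            between += len(vals) * (mu - grand) ** 2
            within += sum((v - mu) ** 2 for v in vals)
        scores.append(between / (within + 1e-12))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

def centroid_predict(train_X, train_y, test_X, feats):
    """Stand-in classifier: assign each test row to the nearest class centroid."""
    classes = sorted(set(train_y))
    centroids = {}
    for c in classes:
        rows = [r for r, lab in zip(train_X, train_y) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    preds = []
    for r in test_X:
        point = [r[j] for j in feats]
        preds.append(min(classes, key=lambda c: sum(
            (p - q) ** 2 for p, q in zip(point, centroids[c]))))
    return preds

def cv_accuracy(X, y, feats, n_folds=5, seed=0):
    """k-fold cross-validated accuracy using only the features in feats."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(n_folds):
        test = idx[f::n_folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        preds = centroid_predict([X[i] for i in train], [y[i] for i in train],
                                 [X[i] for i in test], feats)
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(X)

def incremental_feature_selection(X, y, ranked):
    """Evaluate the top-1, top-2, ... feature subsets and keep the best one."""
    best_k, best_acc, curve = 0, -1.0, []
    for k in range(1, len(ranked) + 1):
        acc = cv_accuracy(X, y, ranked[:k])
        curve.append(acc)
        if acc > best_acc:  # ties keep the smaller subset
            best_k, best_acc = k, acc
    return ranked[:best_k], best_acc, curve

X, y = make_toy_data()
# For brevity the ranking is computed once on all data; a stricter protocol
# would re-rank inside each fold to avoid information leakage.
ranked = rank_features(X, y)
selected, acc, curve = incremental_feature_selection(X, y, ranked)
print("ranked features:", ranked)
print("selected subset:", selected, "CV accuracy:", round(acc, 3))
```

In this sketch the features retained at the best-performing subset size play the role of the "essential genes" mentioned above. Running the same loop once per ranked list and per classifier would mirror the multi-list design the abstract describes.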

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa8f/11355424/bacc9ee492a6/life-14-01032-g001.jpg
