Integrative transcriptomic, proteomic, and machine learning approach to identifying feature genes of atrial fibrillation using atrial samples from patients with valvular heart disease.

Author information

Department of Cardiovascular Medicine/Cardiac Catheterization Lab, Second Xiangya Hospital, Central South University, No. 139 Middle Renmin Road, Changsha, 410011, Hunan Province, People's Republic of China.

Department of Dermatology, Xiangya Hospital, Central South University, Changsha, Hunan Province, People's Republic of China.

Publication information

BMC Cardiovasc Disord. 2021 Jan 28;21(1):52. doi: 10.1186/s12872-020-01819-0.

Abstract

BACKGROUND

Atrial fibrillation (AF) is the most common arrhythmia, yet its mechanisms remain poorly understood. We aimed to investigate the biological mechanism of AF and to discover feature genes by analyzing multi-omics data and applying a machine learning approach.

METHODS

At the transcriptomic level, four microarray datasets (GSE41177, GSE79768, GSE115574, GSE14975) were downloaded from the Gene Expression Omnibus database; together they included 130 available atrial samples from AF and sinus rhythm (SR) patients with valvular heart disease. Microarray meta-analysis was used to identify differentially expressed genes (DEGs). At the proteomic level, a qualitative and quantitative proteomic analysis was conducted on the left atrial appendage of 18 patients (9 with AF and 9 with SR) who underwent cardiac valvular surgery. The machine learning correlation-based feature selection (CFS) method was used to select feature genes of AF from the training set of 130 samples involved in the microarray meta-analysis. A Naive Bayes (NB) based classifier constructed using the training set was evaluated on an independent validation test set, GSE2240.
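The idea behind CFS can be illustrated with a minimal sketch. CFS scores a feature subset by rewarding correlation with the class and penalizing correlation among the features themselves, using the merit k*r_cf / sqrt(k + k*(k-1)*r_ff). The code below is not the authors' pipeline: it assumes Pearson correlation as the correlation measure and a simple greedy forward search, and the feature names (gene_a, gene_b, gene_c) and expression values are hypothetical toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def cfs_merit(features, labels, subset):
    """CFS merit of a subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(features[f], labels)) for f in subset) / k
    # Mean absolute feature-feature correlation (0 for a single feature).
    if k == 1:
        r_ff = 0.0
    else:
        pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward_select(features, labels):
    """Greedy forward search: add the feature that most raises the merit; stop when none does."""
    selected, best = [], 0.0
    remaining = sorted(features)
    while remaining:
        merit, name = max((cfs_merit(features, labels, selected + [f]), f) for f in remaining)
        if merit <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best

# Hypothetical toy data: 4 SR samples (label 0) followed by 4 AF samples (label 1).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "gene_a": [0.1, 0.2, 0.0, 0.1, 0.9, 1.0, 0.8, 1.1],  # informative
    "gene_b": [0.3, 0.2, 0.1, 0.2, 0.6, 0.9, 0.5, 1.0],  # informative but redundant with gene_a
    "gene_c": [0.5, 0.1, 0.9, 0.3, 0.4, 0.2, 0.8, 0.4],  # unrelated to the class
}
selected, merit = cfs_forward_select(features, labels)
print(selected, round(merit, 3))
```

On this toy data only gene_a is kept: gene_b is predictive on its own but so strongly correlated with gene_a that adding it lowers the merit, and gene_c carries no class signal. The selected features would then be passed to a classifier such as Naive Bayes.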

RESULTS

The transcriptomic and proteomic studies yielded 863 DEGs (FDR < 0.05) and 482 differentially expressed proteins (DEPs; FDR < 0.1 and fold change > 1.2), respectively. Analyzing the DEGs and DEPs together identified 30 biomarkers with consistent trends. From these 30 biomarkers, the machine learning CFS method selected 10 features using the training set: 8 upregulated genes (CD44, CHGB, FHL2, GGT5, IGFBP2, NRAP, SEPTIN6, YWHAQ) and 2 downregulated genes (TNNI1, TRDN). The NB-based classifier constructed using the training set accurately and reliably distinguished AF from SR samples in the validation test set, with a precision of 87.5% and an AUC of 0.995.
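The integration step above, keeping only genes that are significant at both levels and change in the same direction, can be sketched as follows. This is an illustrative assumption about how such a filter might look, not the authors' code: the input layout (symbol mapped to log2 fold change and FDR), the symmetric treatment of the fold-change cutoff for downregulated proteins, and the gene symbols and values are all hypothetical.

```python
import math

def consistent_biomarkers(degs, deps, deg_fdr=0.05, dep_fdr=0.1, dep_fc=1.2):
    """Return {symbol: "up" | "down"} for genes significant at both levels with the same direction.

    degs and deps map a gene symbol to (log2_fold_change, fdr).
    """
    # Assumption: the fold-change cutoff applies symmetrically (> 1.2 up or < 1/1.2 down).
    min_abs_log2fc = math.log2(dep_fc)
    result = {}
    for symbol, (deg_lfc, deg_q) in degs.items():
        if deg_q >= deg_fdr or symbol not in deps:
            continue  # not a DEG, or no matching protein measurement
        dep_lfc, dep_q = deps[symbol]
        if dep_q >= dep_fdr or abs(dep_lfc) <= min_abs_log2fc:
            continue  # not a DEP
        if deg_lfc * dep_lfc > 0:  # same sign means a consistent trend
            result[symbol] = "up" if deg_lfc > 0 else "down"
    return result

# Hypothetical toy inputs: symbol -> (log2 fold change AF vs SR, FDR).
degs = {
    "GENE_A": (0.8, 0.01),
    "GENE_B": (-0.6, 0.03),
    "GENE_C": (0.5, 0.20),   # fails the transcript-level FDR cutoff
    "GENE_D": (0.7, 0.02),
}
deps = {
    "GENE_A": (0.5, 0.04),
    "GENE_B": (-0.4, 0.08),
    "GENE_C": (0.6, 0.01),
    "GENE_D": (-0.5, 0.05),  # opposite direction at the protein level
}
biomarkers = consistent_biomarkers(degs, deps)
print(biomarkers)  # {'GENE_A': 'up', 'GENE_B': 'down'}
```

The default thresholds mirror the cutoffs reported in the abstract; everything else in the sketch is a placeholder.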

CONCLUSION

Taken together, our work may provide novel insights into the molecular mechanism of AF and some promising diagnostic and therapeutic targets.
